
AI 101: a glossary

Yabble Aug 11, 2022 8:23:27 PM

We throw around terms like “deep learning” and “neural network” quite a bit in the world of AI — but what do they really mean?

Below, you’ll find a glossary of some basic AI terms that we hope will serve as a helpful reference for anyone interested in artificial intelligence and text analytics. Happy reading!


Note: For flow, we've organized our glossary by category rather than alphabetically. If you're looking for a specific term, we recommend using Ctrl+F on your keyboard!

Artificial intelligence (AI)

The ability of a machine to display human-like intelligence or to perform human tasks. For instance: the ability of Hey Yabble to count themes and sub-themes within a dataset, just like (but 1,000 times faster than!) an expert human coder.

Machine learning

A sub-discipline of artificial intelligence. Machine learning uses data and algorithms to “teach” machines to imitate human learning. Over time, this training makes AIs faster and more accurate. Many machine learning systems learn from labelled data; in other words, their algorithms use input data that’s been structured and organized prior to use.

Neural network

A type of machine learning model. A neural network is a system loosely modelled on the human brain: layers of interconnected nodes pass signals to one another and adjust as the network learns. The way these networks are built enables them to perform functions much like a brain, including pattern recognition and problem-solving. Yabble uses OpenAI’s cutting-edge GPT-3 neural network as part of our industry-leading technology.

Deep learning

A sub-discipline of machine learning. Deep learning uses neural networks with many layers, which allows an AI to make highly accurate predictions and to automate many tasks that once only humans could perform. Unlike traditional machine learning algorithms, which use pre-processed datasets to make predictions, deep learning algorithms are able to analyze unstructured data, including open-text responses and images. Deep learning is a core aspect of the Yabble technology.

Proprietary algorithm

A custom-built algorithm that interacts with a neural network to perform a specific set of tasks. Proprietary algorithms enable AIs to complete tasks that they’re unable to do on their own; for instance, Yabble’s proprietary algorithms interact with OpenAI’s GPT-3 neural network to power our Hey Yabble insights generator. Because of these algorithms, the Yabble technology is unique to our business and unavailable elsewhere on the market.

Human intelligence (HI)

The intellectual capacity of humans. Human intelligence refers to our collective ability to reason, problem-solve, learn, and perform complex cognitive tasks such as grasping abstract concepts.

Natural language processing (NLP)

A sub-discipline of AI. Natural language processing, or NLP, attempts to train machines to understand text and speech like humans can. NLP uses machine learning, deep learning, statistical models, and computational linguistics to process, analyze, and understand the intent and sentiment of language. Natural language understanding (NLU) and natural language generation (NLG) are components of NLP. All three of these are critical to the functionality of the Yabble tools.

Natural language understanding (NLU)

A sub-discipline of NLP. Natural language understanding, or NLU, is the part of NLP that enables machines to ingest and comprehend human language. NLU allows an AI to process syntax (grammar and sentence structure), semantics (the meaning behind the words), and context, ultimately allowing it to establish relationships between words and phrases and understand the full meaning behind text or speech.

Natural language generation (NLG)

A sub-discipline of NLP. Where NLU allows machines to understand language, natural language generation (or NLG) allows them to communicate back to us in that language. NLG is the piece of the Yabble platform that enables Hey Yabble to produce accurate, readable summaries of unstructured text data.

Ensemble modelling

A machine learning technique. Ensemble modelling is all about achieving optimal predictive results; it combines several base models to produce one overarching model that’s more powerful and accurate.
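
As a rough illustration of combining base models, here is a minimal Python sketch of one common ensemble approach, majority voting. The three "models" are hypothetical keyword rules invented for this example (not how Yabble or any real product works); in practice each base model would be a trained machine learning model.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the largest number of base models."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical base models. Each is a deliberately weak keyword rule.
def model_a(text):
    return "positive" if "love" in text else "negative"


def model_b(text):
    return "positive" if "great" in text else "negative"


def model_c(text):
    return "negative" if "not" in text.split() else "positive"


def ensemble_predict(text):
    """Ask every base model for a label, then combine the votes."""
    lowered = text.lower()
    votes = [model(lowered) for model in (model_a, model_b, model_c)]
    return majority_vote(votes)


print(ensemble_predict("Great product"))     # two of three models vote positive
print(ensemble_predict("This is not good"))  # all three models vote negative
```

No single rule here is reliable, but the combined vote is right more often than any one rule alone, which is the core idea behind ensemble modelling.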

Unstructured text data

Datasets without a predefined structure. Unstructured data represents the vast majority of data in the world and has historically been very labor-intensive to process and analyze. It’s typically text-heavy, coming from sources such as social media comments or in-depth interviews. Structured data is the other side of the coin here; unlike unstructured, structured data is typically numerical, easily analyzed, and contained in an organized format like an Excel spreadsheet.

Corpus data

A foundational component of NLP. Corpus data is a collection of text or audio organized into a dataset. It can contain text, speech, or both, and is used to train machine learning systems. The corpus data serves as an indication of what the optimal AI output should be. The larger and higher-quality the corpus data, the better.

Sentiment analysis

A function of AI that identifies and categorizes the overall sentiment of a dataset, typically as positive, negative, or neutral. This output is invaluable for businesses in understanding how their brand and decisions are perceived.
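
To show the basic shape of the task, here is a minimal Python sketch of a lexicon-based sentiment scorer. The word lists and sample responses are made up for this example, and real sentiment analysis relies on trained models that account for context rather than simple word counting.

```python
from collections import Counter

# Tiny hypothetical word lists; a real lexicon would be far larger.
POSITIVE = {"love", "great", "helpful", "fast"}
NEGATIVE = {"hate", "slow", "confusing", "broken"}


def sentiment(text):
    """Label one response by counting positive and negative words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(responses):
    """Count how many responses fall into each sentiment category."""
    return Counter(sentiment(response) for response in responses)


responses = [
    "I love the fast checkout!",
    "The app is slow and confusing.",
    "It arrived on Tuesday.",
]
print(summarize(responses))
```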

Semantic search

The process that AIs use to determine the intent and context of a given search, rather than simply matching keywords. This determines which models the machine uses to analyze your question and helps it return the most accurate results.

Word vectors

Word vectors are lists of numbers that attempt to express the meaning of a word. They’re used in embeddings (defined below) to streamline an AI’s analysis process.


Embeddings

An application of deep learning. In technical terms, embeddings are “dense numerical representations of real-world objects and relationships.” In practical terms, they’re used within neural networks to categorize and relate results generated from a dataset. Basically, embeddings enable similar datapoints to cluster together so that an AI can group similar results together in its output. Hey Yabble, for instance, uses embeddings to generate summaries of related themes and sub-themes.

Text embeddings

Also called word embeddings, text embeddings are numeric vectors that represent words. They allow AIs to assign similar representations to words with similar meanings. Text embeddings are fed into machine learning models and help train them to use words to better predict the context and words around them. For example: words like “cat” and “dog” should be grouped more closely together than “cat” and “car.”
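
The “cat”, “dog”, and “car” example can be sketched in a few lines of Python. The three-number vectors below are invented by hand purely for illustration (real embeddings are learned by a model and have hundreds or thousands of dimensions), and cosine similarity is one common way, though not the only one, to measure how close two vectors are.

```python
import math


def cosine_similarity(a, b):
    """Return a score near 1.0 for vectors pointing the same way, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (an assumption for this example, not real embeddings).
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])
print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

With these toy values, “cat” scores much closer to “dog” than to “car”, which is the grouping behavior described above.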

Topic analysis

An NLP technique. In topic analysis, an AI organizes and understands large datasets by assigning “labels” or categories to each piece of data (e.g. each individual survey response or social media comment). By using NLP, topic analysis identifies semantic and syntactical patterns and generates meaningful, contextual insights.


Clustering

A machine learning technique. As the name suggests, clustering automatically groups individual datapoints into “clusters” of relation. It can be used to identify patterns and to establish relationships within a dataset.
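
As one example of how automatic grouping can work, here is a minimal Python sketch of k-means, a widely used clustering algorithm. The datapoints and starting centroids are made up for illustration, and this is only one of many clustering methods.

```python
def kmeans(points, centroids, iterations=10):
    """Group points around the nearest centroid, then move each centroid to its group's average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Squared distance from this point to every centroid.
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Recompute each centroid as the mean of its cluster (unchanged if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids


# Hypothetical two-dimensional datapoints forming two obvious groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
clusters, centroids = kmeans(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

Note that nobody tells the algorithm which group each point belongs to; the groups emerge from the data itself.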


Classification

The process by which machines recognize datapoints and sort them into “classes” (or categories).

Language models

A language model analyzes huge amounts of text to learn the probability of a word or sequence of words occurring in a sentence. Essentially, neural networks use these probabilities to refine their linguistic predictions and to optimize the accuracy of results.
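
The idea of word probabilities can be sketched with a very simple bigram model in Python, which estimates how likely one word is to follow another by counting word pairs. The three-sentence corpus is invented for this example; modern language models such as GPT-3 use neural networks trained on vastly more text, not raw counts.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def next_word_probability(counts, current, following):
    """Estimate P(following | current) from the pair counts."""
    total = sum(counts[current].values())
    return counts[current][following] / total if total else 0.0


# Hypothetical toy corpus.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(corpus)
print(next_word_probability(model, "the", "cat"))  # "cat" follows "the" in 2 of 3 cases
```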


Transformers

A deep learning model. Transformers learn linguistic context and meaning by tracking relationships in sequential data. Using a mathematical technique called attention, they can pinpoint even distant relationships between datapoints, using that knowledge to generate accurate results from datasets at speed.
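
In transformers, this relationship tracking is done by a mechanism called attention. Below is a heavily simplified Python sketch of scaled dot-product self-attention. It assumes each word is already a small numeric vector and leaves out the learned query, key, and value weights that real transformers use, so treat it as an illustration of the idea rather than a working transformer.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(vectors):
    """Let every position weigh every other position by similarity (no learned weights)."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity between this position and every position, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # The output is a weighted blend of all the input vectors.
        outputs.append(
            [sum(w * vector[i] for w, vector in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs, all_weights


# Hypothetical sequence: positions 0 and 2 are similar, position 1 is different.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(sequence)
print(weights[0])
```

In this toy sequence, position 0 gives more weight to position 2 than to its immediate neighbor at position 1, showing how attention can link related datapoints regardless of how far apart they sit.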


Context

Context, or contextual awareness, is fundamental to the accurate output of any AI. Within machine learning and text analytics, context is what allows a machine to genuinely understand the text, speech, or imagery within the dataset it’s reading, and then to use that understanding to generate accurate results that make sense for the question being asked of the data.

Want to learn even more about these terms in a practical context that can transform your business? Tap the button to book a personalized demo of Hey Yabble.

Book a demo