This article was written in part using ChatGPT, then reviewed and improved by a human. 🤖 🤝 🤓
To use generative AI consciously and ethically, it is still necessary to understand how it works. And AI is not that:

[image]
But more like that:

[image]
LLMs (Large Language Models) like ChatGPT are AI models that use learning algorithms to generate content such as text; similar generative models produce images, music, or videos. This learning often requires a very large amount of data and documents for the model to be effective.
LLMs don't analyze words directly; they deal with tokens. A token is a basic unit used to represent and process language in generative AI: it can be a letter, a syllable, a word, or part of a word.
Tokens are the interface between human language and the machine, allowing LLMs to understand, analyze, and generate text in a consistent and relevant manner. For example, the word “intelligence” is composed of two tokens: 'int' and 'elligence' (click here to learn more about tokens, or here to see concretely how a text is broken down into tokens).
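To see tokenization in action yourself, here is a minimal sketch using OpenAI's open-source tiktoken library (assuming it is installed with `pip install tiktoken`; the exact split varies from one encoding, and therefore one model, to another):

```python
import tiktoken

# Load one of OpenAI's public encodings (an assumption: pick the encoding
# that matches the model you are interested in).
enc = tiktoken.get_encoding("cl100k_base")

text = "intelligence"
token_ids = enc.encode(text)                       # the integer ids the model actually sees
pieces = [enc.decode([tid]) for tid in token_ids]  # the matching text fragments

print(token_ids)
print(pieces)  # e.g. sub-word pieces such as 'int' and 'elligence'
```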
Concretely, splitting texts into tokens allows the model to learn which word combinations are frequent, how words come together to form sentences, and how meaning changes with context, thanks to the Transformer architecture, which is at the heart of ChatGPT's success.
To understand a request (a prompt), the model first divides it into tokens, then generates a text response by predicting each next token using the patterns learned during training.
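To make this "predict the next word" idea concrete, here is a toy sketch in Python. The probability table below is entirely made up for illustration; a real LLM learns billions of parameters with a Transformer instead of counting word pairs, but the generation loop is the same idea: predict, append, repeat.

```python
import random

# Toy next-word model: probabilities that would be estimated from word-pair
# counts in a (hypothetical) training corpus.
model = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(prompt: str, max_words: int = 5) -> str:
    words = prompt.lower().split()  # a crude stand-in for tokenization
    for _ in range(max_words):
        options = model.get(words[-1])
        if not options:             # no learned continuation: stop
            break
        # Sample the next word according to the learned probabilities.
        nxt = random.choices(list(options), weights=options.values())[0]
        words.append(nxt)
    return " ".join(words)

print(generate("the cat"))  # e.g. "the cat sat down"
```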
It is important to note that the answers generated by LLMs are not based solely on statistics: they are also influenced by rules and alignment choices defined by humans (for example, through reinforcement learning from human feedback).
Although powerful, LLMs are far from perfect.
The data you provide to an LLM can be used to train future models or be accessed by the teams that operate them, and its exact use is often unclear.
LLMs are not designed to provide accurate and sourced information. They produce text based on statistical models, which means that the veracity of the information is not guaranteed. Concretely, ChatGPT can confidently give a false answer rather than abstain (what we call hallucinations), unless it detects that the question falls outside its training data (for example, events after its knowledge cutoff). Models can also reflect biases present in their training data.
There are ways to limit these pitfalls, such as asking ChatGPT to source its answers. You can also do a "prompt follow-up", where you ask ChatGPT to analyze its own response and say whether everything is true, two or three times in a row if the answer seems really surprising.
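As an illustration, here is what such a follow-up could look like with OpenAI's official Python client (a sketch, not a production recipe: the model name, the question, and the follow-up wording are assumptions to adapt, and an `OPENAI_API_KEY` environment variable is assumed to be set):

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

messages = [{"role": "user",
             "content": "When was the Eiffel Tower completed? Please cite your sources."}]
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
answer = first.choices[0].message.content

# Prompt follow-up: feed the answer back and ask the model to re-examine it.
messages += [
    {"role": "assistant", "content": answer},
    {"role": "user",
     "content": "Analyze your previous answer and say whether everything is true."},
]
second = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(second.choices[0].message.content)
```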
Building a good prompt (specific and clear) to guide the model remains essential for getting answers you can verify. Good news: we wrote an article about it ;).