data:image/s3,"s3://crabby-images/826d1/826d1f4037d8b473a880074ca38a36ddd394f862" alt="Artificial Vision and Language Processing for Robotics"
Introduction
Natural Language Processing (NLP) is an area of Artificial Intelligence (AI) with the goal of enabling computers to understand and manipulate human language in order to perform useful tasks. Within this area, there are two sections: Natural Language Understanding (NLU) and Natural Language Generation (NLG).
In recent years, AI has changed the way machines interact with humans. AI helps people with complex tasks, such as recommending a movie according to your tastes (recommender systems). Thanks to the high performance of GPUs and the huge amount of data available, it is possible to create intelligent systems that are capable of learning and, in some tasks, behaving like humans.
There are many libraries that aim to help with the creation of these systems. In this chapter, we will review the most famous Python libraries for extracting and cleaning information from raw text. Extracting and cleaning text is only a first step, because a complete understanding and interpretation of language is a difficult task in itself. For example, the sentence "Cristiano Ronaldo scores three goals" would be hard for a machine to understand because it would not know who Cristiano Ronaldo is or what is meant by the number of goals.
One of the most popular topics in NLP is Question Answering (QA). This discipline also draws on Information Retrieval (IR). QA systems construct answers by querying a database for knowledge or information, and they are also capable of extracting answers from a collection of natural language documents. A search engine such as Google works in a similar way.
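The idea of extracting an answer from a collection of documents can be illustrated with a minimal sketch. It assumes the simplest possible scoring rule, counting the words a question shares with each document, which is far cruder than what a real search engine does. The function names `tokenize` and `best_match` and the sample documents are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_match(question, documents):
    """Return the document that shares the most words with the question."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


# Hypothetical document collection (an assumption, not data from the chapter).
documents = [
    "Cristiano Ronaldo scored three goals in the match.",
    "Wikipedia contains millions of articles.",
    "Chatbots are conversational agents.",
]

answer = best_match("Who scored three goals?", documents)
print(answer)
```

Word overlap ignores word order, synonyms, and meaning, which is exactly why real QA systems need the understanding techniques discussed in the rest of the chapter.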
In the industry today, NLP is becoming more and more popular. The latest NLP trends are online advertisement matching, sentiment analysis, automated translation, and chatbots.
Conversational agents, popularly known as chatbots, are the next challenge for NLP. They can hold real conversations, and many companies use them to get feedback about their products or to design new advertising campaigns by analyzing the behavior and opinions of clients through the chatbot. Virtual assistants are a great example of NLP, and they have already been introduced to the market. The most famous are Siri, Amazon's Alexa, and Google Home. In this book, we will create a chatbot to control a virtual robot that is able to understand what we want the robot to do.
Natural Language Processing
As mentioned before, NLP is an AI field that deals with understanding and processing human language. NLP sits at the intersection of AI, computer science, and linguistics. The main aim of this area is to make computers understand statements or words written in human languages:
Figure 3.1: Representation of NLP within AI, linguistics, and computer science
Linguistics is the scientific study of human language; it tries to characterize and explain the different aspects of language.
A language can be defined as a set of rules and a set of symbols. Symbols are combined to convey information, and rules govern how they are structured. Human language is special. We cannot simply picture it as naturally formed symbols and rules, because the meaning of words can change depending on the context.
NLP is becoming more popular and can solve many difficult problems. The amount of text data available is very large, and it is impossible for a human to process all that data. In Wikipedia, the average number of new articles per day is 547, and in total, there are more than 5,000,000 articles. As you can imagine, a human cannot read all that information.
There are three challenges faced by NLP. The first challenge is collecting all the data, the second is classifying it, and the final one is extracting the relevant information.
NLP solves many tedious tasks, such as spam detection in emails, part-of-speech (POS) tagging, and named entity recognition. With deep learning, NLP can also solve voice-to-text problems. Although NLP is powerful, some problems still lack a good solution, such as dialog between a human and a machine, QA systems, summarization, and machine translation.
Parts of NLP
As mentioned before, NLP can be divided into two groups: NLU and NLG.
Natural Language Understanding
This section of NLP relates to the understanding and analysis of human language. It focuses on comprehending text data and processing it to extract relevant information. NLU provides direct human-computer interaction and performs tasks related to the comprehension of language.
NLU covers one of the hardest challenges in AI: the interpretation of text. Within NLU, the main challenge is understanding dialog.
Note
NLP uses a set of methods for generating, processing, and understanding language. NLU uses functions to understand the meaning of a text.
Previously, a conversation was represented as a tree, but this approach cannot cover many dialog cases. To cover more cases, more trees would be required, one for each context of the conversation, which leads to many repeated sentences:
Figure 3.2: Representation of a dialogue using trees
This approach is outdated and inefficient because it is based on fixed rules; it is essentially an if-else structure. NLU has contributed another approach: a conversation can be represented as a Venn diagram where each set is a context of the conversation:
Figure 3.3: Representation of a conversation using a Venn diagram
As you can see in the previous figures, the NLU approach gives a more flexible structure for understanding a conversation, because it does not rely on a fixed set of if-else conditions. The main goal of NLU is to interpret the meaning of human language and deal with the contexts of a conversation, resolving ambiguities and managing data.
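The contrast between the two representations can be sketched in a few lines of Python. This is only one possible reading of the tree and Venn diagram figures, not the method used later in the book. The messages, context names, and keyword sets are all hypothetical.

```python
def tree_reply(message):
    """Fixed-rule dialog: every supported message must be listed in advance."""
    if message == "hello":
        return "Hi! Do you want to move the robot?"
    elif message == "yes":
        return "Which direction?"
    elif message == "left":
        return "Moving left."
    else:
        return "Sorry, I did not understand."


# Context-based dialog: each context is a set of keywords, and the sets may
# overlap, like the regions of a Venn diagram. All keywords are assumptions.
CONTEXTS = {
    "greeting": {"hello", "hi", "hey"},
    "movement": {"move", "go", "left", "right", "forward"},
    "status": {"battery", "position", "where", "left"},  # as in "battery left"
}


def matching_contexts(message):
    """Return, in alphabetical order, every context that shares a word with the message."""
    words = set(message.lower().split())
    return sorted(name for name, keywords in CONTEXTS.items() if words & keywords)


print(tree_reply("go left"))         # not an exact rule, so the default reply is used
print(matching_contexts("go left"))  # the word "left" belongs to two contexts
```

The tree fails on any message it was not written for, while the set-based version reports every context a message could belong to and leaves the ambiguity to be resolved by a later step.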
Natural Language Generation
NLG is the process of producing phrases, sentences, and paragraphs with meaning and structure. It is an area of NLP that does not deal with understanding text.
To generate natural language, NLG methods need relevant data.
NLG has three components:
- Generator: Responsible for fitting the text to an intent so that it relates to the context of the situation
- Components and levels of representations: Gives structure to the generated text
- Application: Saves relevant data from the conversation to follow a logical thread
Generated text must be in a human-readable format. The advantages of NLG are that you can make your data accessible and you can create summaries of reports rapidly.
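As a rough illustration of turning data into human-readable text, the sketch below fills a sentence template from a small report. Template filling is only one simple NLG technique, chosen here for brevity, and it does not model the three components listed above separately. The function `generate_summary` and the report fields `metric`, `change`, and `period` are hypothetical.

```python
def generate_summary(report):
    """Turn a dictionary of report data into a human-readable sentence."""
    change = report["change"]
    if change > 0:
        trend = "rose"
    elif change < 0:
        trend = "fell"
    else:
        trend = "stayed flat"

    sentence = f"{report['metric'].capitalize()} {trend}"
    if change != 0:
        # Only mention the size of the change when there is one.
        sentence += f" by {abs(change)}%"
    return f"{sentence} in {report['period']}."


# Hypothetical report data (an assumption, not taken from the chapter).
report = {"metric": "sales", "change": -4.5, "period": "March"}
print(generate_summary(report))
```

Even this small example shows why NLG needs relevant data: the wording of the sentence depends entirely on the values in the report.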
Levels of NLP
Human language has different levels of representation. Each representation level is more complex than the previous level. As we ascend through the levels, it gets more difficult to understand the language.
The first two levels depend on the data type (audio or text):
- Phonological analysis: If the data is speech, we first need to analyze the audio to obtain sentences.
- OCR/tokenization: If the text is contained in an image, we need to recognize the characters and form words using computer vision (OCR). If it is already raw text, we need to tokenize it (that is, split the sentence into units of text).
Note
The OCR process is the identification of characters in an image. Once it generates words, they are processed as raw text.
- Morphological analysis: Focuses on the words of a sentence and analyzes their morphemes.
- Syntactic analysis: This level focuses on the grammatical structure of a sentence. That means understanding different parts of a sentence, such as the subject or the predicate.
- Semantic representation: A program cannot understand a word in isolation; it infers the meaning of a word from how the word is used in a sentence. For example, "cat" and "dog" could look the same to an algorithm because they can be used in the same way. Understanding words in this way is called word-level meaning.
- Discourse processing: Analyzing and identifying connected sentences in a text and their relationships. By doing this, an algorithm could understand what the topic of the text is.
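Two of the levels above, tokenization and word-level meaning, can be sketched with the Python standard library alone. The example assumes a deliberately simple tokenizer based on a regular expression and represents each word only by its immediate neighbors, which is much cruder than the techniques used by real NLP libraries. The function names and sample sentences are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(sentence):
    """Split a sentence into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def context_sets(sentences):
    """Map each word to the set of (previous word, next word) pairs around it."""
    contexts = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so the first and last words have neighbors.
        tokens = ["<s>"] + tokenize(sentence) + ["</s>"]
        for i in range(1, len(tokens) - 1):
            contexts[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return contexts


# Hypothetical sentences (an assumption, not data from the chapter).
sentences = [
    "The cat sleeps on the sofa.",
    "The dog sleeps on the sofa.",
    "The cat eats fish.",
    "The dog eats meat.",
]

print(tokenize(sentences[0]))

contexts = context_sets(sentences)
shared = contexts["cat"] & contexts["dog"]
print(shared)  # contexts in which "cat" and "dog" are interchangeable
```

In this tiny corpus, "cat" and "dog" appear in exactly the same contexts, so an algorithm that relies only on usage has no way to tell them apart, while "cat" and "sofa" share no context at all.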
NLP shows great potential in today's industry, but there are still problems it does not solve well. Using deep learning concepts, we can tackle some of these problems and get better results. Some of them will be reviewed in Chapter 4, Neural Networks with NLP. Advances in text processing techniques and improvements in recurrent neural networks are the reasons why NLP is becoming increasingly important.