Key Concepts

Review core concepts you need to learn to master this subject

Natural Language Processing

Natural language processing (NLP) is concerned with enabling computers to interpret, analyze, and approximate the generation of human language. Typical tasks include answering questions, translating between languages, identifying languages, summarizing documents, analyzing the sentiment of text, spell checking, and speech recognition. The field sits at the intersection of linguistics, artificial intelligence, and computer science.

Natural Language Toolkit

Natural Language Toolkit (NLTK) is a Python library for building programs that work with human language data, with a focus on statistical natural language processing (NLP).

NLTK contains text processing libraries for tokenization, parsing, classification, stemming, tagging and semantic reasoning. It also includes graphical demonstrations and sample data sets for NLP.

Language Models

Language models are probabilistic models of language used in NLP tasks. They learn the probability of word occurrence over sequences of words and use it to estimate the relative likelihood of different phrases. This is useful in applications such as speech recognition, optical character recognition, handwriting recognition, machine translation, and spelling correction.

Common language models include:

  • Statistical models
    • Bag of words (unigram model)
      • applications include term frequency, topic modeling, and word clouds
    • _n_-gram models
  • Neural language models (NLM)
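
As a rough illustration of the two statistical models listed above, the sketch below counts unigrams (bag of words) and bigrams using only the Python standard library. The helper names `tokenize`, `bag_of_words`, and `ngrams` are made up for this example, and the regex tokenizer is a deliberately naive assumption, not how any particular library splits text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(tokens):
    """Unigram model: count each word, ignoring word order entirely."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "The squid swam past the squid"
tokens = tokenize(text)

bow = bag_of_words(tokens)                  # word order is discarded
bigram_counts = Counter(ngrams(tokens, 2))  # neighboring words are kept together

print(bow["squid"])                      # 2
print(bigram_counts[("the", "squid")])   # 2
```

The bag-of-words counts are enough for term frequency or a word cloud, while the bigram counts keep the neighbor information that an _n_-gram model relies on.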

Text Similarity in NLP

Text similarity is a facet of NLP concerned with the similarity between texts. Two popular text similarity metrics are Levenshtein distance and cosine similarity.

Levenshtein distance, also called edit distance, is defined as the minimum number of edit operations (deletions, insertions, or substitutions) required to transform a text into another.
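
One common way to compute this distance is dynamic programming over prefixes of the two strings. The function below is a minimal sketch of that approach; the name `levenshtein` is our own, and every edit operation is assumed to cost 1.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions, or substitutions
    needed to turn `source` into `target` (each operation costs 1)."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            substitution_cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,                      # delete s_char
                current[j - 1] + 1,                   # insert t_char
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table at a time bounds memory by the length of `target` rather than the product of both lengths.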

Cosine similarity measures the cosine of the angle between two vectors. To determine cosine similarity, text documents need to be converted into vectors.
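
For vectors $A$ and $B$, the cosine is $\frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$. The sketch below assumes the simplest possible vectorization, raw word counts from whitespace-split text; real systems often use weighted or learned vectors instead. The function name `cosine_similarity` is chosen for this example only.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    # Assumption: each document becomes a sparse vector of raw word counts.
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())

    # Only words present in both texts contribute to the dot product.
    dot = sum(vec_a[word] * vec_b[word] for word in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


score = cosine_similarity("the cat sat", "the cat ran")
```

With count vectors the result falls between 0 (no shared words) and 1 (identical word proportions), regardless of document length.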

Language Prediction in NLP

Language prediction is an application of NLP concerned with predicting the text that comes next, given the text that precedes it.

Auto-suggest and suggested replies are common forms of language prediction. Common approaches include:

  • _n_-gram models based on Markov chains
  • Long Short-Term Memory (LSTM) neural networks
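
The first approach can be sketched with a bigram Markov chain: count which word follows each word in a training corpus, then suggest the most frequent follower. The toy corpus and the helper names `train_bigram_model` and `predict_next` are assumptions for illustration; a practical model would train on far more text and usually sample or smooth rather than always take the top count.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Map each word to a Counter of the words observed right after it."""
    model = defaultdict(Counter)
    for current_word, next_word in zip(tokens, tokens[1:]):
        model[current_word][next_word] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, already tokenized by splitting on whitespace.
corpus = "i like tea . i like coffee . i like tea with milk".split()
model = train_bigram_model(corpus)

print(predict_next(model, "like"))  # tea
```

Because the prediction depends only on the single preceding word, this is a first-order Markov chain; larger values of _n_ condition on longer histories.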

Getting Started with Natural Language Processing

Lesson 1 of 1

  1. Look at the technologies around us: - Spellcheck and autocorrect - Auto-generated video captions - Virtual assistants like Amazon’s Alexa - Autocomplete - Your news site’s suggested articles What …
  2. “You never know what you have… until you clean your data.” ~ Unknown (or possibly made up) Cleaning and preparation are crucial for many tasks, and NLP is no exception. Text preprocessing…
  3. You now have a preprocessed, clean list of words. Now what? It may be helpful to know how the words relate to each other and the underlying syntax (grammar). Parsing is a stage of NLP concern…
  4. How can we help a machine make sense of a bunch of word tokens? We can help computers make predictions about language by training a language model on a corpus (a bunch of example text). Langu…
  5. For parsing entire phrases or conducting language prediction, you will want to use a model that pays attention to each word’s neighbors. Unlike bag-of-words, the n-gram model considers a sequ…
  6. We’ve touched on the idea of finding topics within a body of language. But what if the text is long and the topics aren’t obvious? Topic modeling is an area of NLP dedicated to uncovering l…
  7. Most of us have a good autocorrect story. Our phone’s messenger quietly swaps one letter for another as we type and suddenly the meaning of our message has changed (to our horror or pleasure). Howe…
  8. How does your favorite search engine complete your search queries? How does your phone’s keyboard know what you want to type next? Language prediction is an application of NLP concerned with …
  9. Believe it or not, you’ve just scratched the surface of natural language processing. There are a slew of advanced topics and applications of NLP, many of which rely on deep learning and neural netw…
  10. As you’ve seen, there are a vast array of applications for NLP. However, as they say, “with great language processing comes great responsibility” (or something along those lines). When working wit…
  11. Check out how much you’ve learned about natural language processing! - Natural language processing combines computer science, linguistics, and artificial intelligence to enable computers to process…
