Named entity recognition (NER) is a subfield of computational linguistics that identifies named entities (proper names) in a text and sorts them into predefined categories. The technique plays a particularly important role in machine learning.

What is named entity re­cog­ni­tion (NER)?

Named entity recognition (NER for short) identifies proper names in texts and automatically assigns them to specific categories, which is why the method is also referred to as proper name recognition. Proper names, or named entities, are individual words or sequences of several words that denote a real-world entity. This can be, for example, a person, a company, an authority, an event, a place, a specific product or even a date.

The discipline originates from natural language processing (NLP), in which natural language is categorised and processed using algorithms and fixed rules, and it's also used in machine learning and artificial intelligence. Thanks to continuous development, named entity recognition now achieves convincing success rates in many languages, and in many cases its output is hard to distinguish from identification by a human being.

How does named entity re­cog­ni­tion work?

There are various methods for named entity recognition, which are discussed in more detail later in this article. Every method, however, relies on two basic steps that largely determine how well it works.

Iden­ti­fic­a­tion of proper names

The first step is the actual identification of one or more named entities. These are not just typical personal names such as ‘Emily Williams’. Proper nouns such as ‘Lake Tahoe’, ‘Second World War’, ‘Porsche’, ‘Adirondack Mountains’, ‘Jurassic Park’ or ‘October 12, 1986’ are also considered named entities and can therefore be captured by named entity recognition. Once these proper nouns have been identified, their beginning and end are marked so that a system can recognise them within natural text.

Cat­egor­isa­tion of named entities

After identification, the marked proper names are assigned to defined categories. These include personal names, places, historical events, companies, authorities, products, dates or certain media titles and works of art. For the result to be usable, named entity recognition must recognise variants of the same entity, and the previously established start and end points must be correct.
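The two steps can be illustrated with a minimal Python sketch. The `Entity` class, the `mark` helper, the example sentence and the category names are assumptions made for illustration only. A real system would find the boundaries itself instead of being handed the phrase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int   # index of the first character of the entity
    end: int     # index one past the last character (exclusive)
    label: str   # category assigned in the second step

    def text(self, source: str) -> str:
        """Return the marked span from the source text."""
        return source[self.start:self.end]


def mark(source: str, phrase: str, label: str) -> Entity:
    """Step 1: record where the phrase begins and ends.
    Step 2: attach the category label to that span."""
    start = source.index(phrase)  # raises ValueError if the phrase is absent
    return Entity(start, start + len(phrase), label)


sentence = "Emily Williams visited Lake Tahoe on October 12, 1986."
entities = [
    mark(sentence, "Emily Williams", "PERSON"),
    mark(sentence, "Lake Tahoe", "PLACE"),
    mark(sentence, "October 12, 1986", "DATE"),
]

for entity in entities:
    print(entity.start, entity.end, entity.label, entity.text(sentence))
```

Storing character offsets rather than the text alone makes it possible to check later whether the start and end points are correct.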

What NER pro­ced­ures are there?

While the two steps in named entity recognition must always be carried out, there are various procedures and methods for achieving the desired results. We'll show you the four most common approaches.

Analysis with dic­tion­ar­ies

In what's probably the simplest method, the words and word sequences in a text are compared with the entries of one or more dictionaries. As soon as there's a match between a word or word sequence and a proper name in a dictionary, it's marked as a named entity and then assigned to the corresponding category.
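A dictionary lookup of this kind might look roughly like the following sketch. The `GAZETTEER` entries, the category names and the longest-match-first strategy are assumptions chosen for illustration. Real dictionaries are far larger and usually need extra handling for spelling variants.

```python
import re

# Hypothetical dictionary: lower-cased proper name -> category
GAZETTEER = {
    "emily williams": "PERSON",
    "lake tahoe": "PLACE",
    "adirondack mountains": "PLACE",
    "porsche": "COMPANY",
    "jurassic park": "WORK",
}
MAX_WORDS = max(len(name.split()) for name in GAZETTEER)


def dictionary_ner(text):
    """Return (entity text, start, end, category) for every dictionary match."""
    # Each token keeps its character offsets so that spans can be marked.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest possible word sequence first, then shorter ones.
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            candidate = " ".join(tok.lower() for tok, _, _ in tokens[i:i + n])
            if candidate in GAZETTEER:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                found.append((text[start:end], start, end, GAZETTEER[candidate]))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found


for hit in dictionary_ner("Emily Williams drove her Porsche to Lake Tahoe."):
    print(hit)
```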

Rule-based named entity re­cog­ni­tion

Defined rules can also be used as a basis for named entity recognition. For this purpose, patterns are developed and compared with the existing texts. Where a pattern matches, the entity is identified and categorised. The rule-based method is particularly suitable for certain specialist texts, but less so for general use.
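As a rough illustration, such patterns can be written as regular expressions. The three rules below are hypothetical and deliberately narrow. They only cover the date and place formats used in the examples of this article.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical rule catalogue: (category, pattern)
RULES = [
    ("DATE", re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")),
    ("PLACE", re.compile(r"\bLake [A-Z][a-z]+\b")),
    ("PLACE", re.compile(r"\b[A-Z][a-z]+ Mountains\b")),
]


def rule_based_ner(text):
    """Return (start, end, category, entity text) tuples sorted by position."""
    found = []
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), label, match.group()))
    return sorted(found)


text = "We hiked in the Adirondack Mountains and reached Lake Tahoe on October 12, 1986."
for start, end, label, entity in rule_based_ner(text):
    print(start, end, label, entity)
```

The sketch also shows the limitation mentioned above: any place or date that doesn't follow one of the listed patterns is simply missed.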

Machine learning and AI

The best results are achieved with methods that use machine learning or AI as a basis. Data sets are used to train the corresponding systems, and the recognition of statistical correlations plays a particularly important role here. Once the training is complete, the AI can search through unknown texts, recognise proper names and assign them to a category. As a rule of thumb, the more comprehensive and balanced the training data, the better the subsequent results.
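The idea of learning statistical correlations from labelled data can be sketched with a deliberately naive model. It counts how often simple features (the word itself, the previous word, capitalisation) occur with each category and lets those counts vote. The training sentences, feature names and labels are invented for illustration, and production systems use far more capable models than this.

```python
from collections import Counter, defaultdict


def features(tokens, i):
    """Describe the token at position i with three simple features."""
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return [
        f"word={tokens[i].lower()}",
        f"prev={prev}",
        f"cap={tokens[i][0].isupper()}",
    ]


def train(sentences):
    """Count how often each feature occurs together with each label."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [token for token, _ in sentence]
        for i, (_, label) in enumerate(sentence):
            for feature in features(tokens, i):
                counts[feature][label] += 1
    return counts


def predict(counts, tokens):
    """Label each token by summing the label proportions of its known features."""
    labels = []
    for i in range(len(tokens)):
        scores = Counter()
        for feature in features(tokens, i):
            seen = counts.get(feature)
            if not seen:
                continue  # feature never observed during training
            total = sum(seen.values())
            for label, n in seen.items():
                scores[label] += n / total
        labels.append(max(scores, key=scores.get) if scores else "O")
    return labels


# Tiny invented training set; "O" marks tokens outside any entity.
TRAINING = [
    [("Emily", "PERSON"), ("lives", "O"), ("in", "O"), ("Boston", "PLACE")],
    [("Anna", "PERSON"), ("works", "O"), ("in", "O"), ("Denver", "PLACE")],
    [("Tom", "PERSON"), ("lives", "O"), ("near", "O"), ("Chicago", "PLACE")],
    [("She", "O"), ("met", "O"), ("Tom", "PERSON"), ("in", "O"), ("Boston", "PLACE")],
]

model = train(TRAINING)
print(predict(model, ["Maria", "works", "in", "Seattle"]))
```

Neither 'Maria' nor 'Seattle' appears in the training data. The model labels them from context alone, such as the position in the sentence or the preceding word 'in', which is the kind of correlation described above.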

Hybrid of rule-based and AI-supported NER

A hybrid approach of rule-based and AI-supported named entity recognition can also provide very good results. Simple proper names are identified by the rule catalogue, while more complex entities are found and categorised by artificial intelligence.
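One possible way to combine the two is sketched below, assuming that rule matches take precedence and the model only fills in what the rules leave open. The `toy_model` function is a placeholder, not a trained model. It just marks runs of capitalised words so that the merging logic can be shown in a self-contained example.

```python
import re

DATE_RULE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)
CAPITALISED_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def rule_pass(text):
    """Rule catalogue: here a single hypothetical rule for dates."""
    return [(m.start(), m.end(), "DATE") for m in DATE_RULE.finditer(text)]


def toy_model(text):
    """Placeholder for a trained model: labels capitalised runs as ENTITY."""
    return [(m.start(), m.end(), "ENTITY") for m in CAPITALISED_RUN.finditer(text)]


def overlaps(a, b):
    """True if two (start, end, label) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def hybrid_ner(text, model=toy_model):
    """Keep every rule match, then add model spans that don't collide with one."""
    spans = rule_pass(text)
    for candidate in model(text):
        if not any(overlaps(candidate, rule_span) for rule_span in spans):
            spans.append(candidate)
    return sorted(spans)


text = "Emily Williams saw Jurassic Park on October 12, 1986."
for start, end, label in hybrid_ner(text):
    print(label, text[start:end])
```

In this example the placeholder model also proposes 'October' on its own, but that span is discarded because it overlaps with the date already found by the rule.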

What ap­plic­a­tions does NER have?

There are numerous actual or con­ceiv­able future areas of ap­plic­a­tion for named entity re­cog­ni­tion. Here are some of the most important:

  • Sentiment analysis: Named entity re­cog­ni­tion is already being used to evaluate customer feedback and trends. For example, the AI iden­ti­fies brand names, opinions on products or other reactions.
  • Business in­tel­li­gence: NER is used to convert un­struc­tured texts into struc­tured data. This can be used in the area of in­form­a­tion retrieval and helps with the analysis of financial documents.
  • Data annotation: Data annotation can be used to develop and train improved models for text translation, classification and analysis. Named entity recognition plays an important role in this.
  • Digital as­sist­ance: Named entity re­cog­ni­tion is suitable for services such as chatbots or other digital as­sist­ants. It evaluates requests from users and can provide cus­tom­ised response options on that basis.
  • Keyword­ing: This method is used, for example, to filter people or places from different articles and then store them as meta in­form­a­tion.
  • Search engines: The method is used to evaluate and improve search al­gorithms. This enables search engines to provide even more relevant results.
  • Neural networks: NER is also used in connection with long short-term memory (LSTM) networks and comparable techniques.

What are the problems with named entity re­cog­ni­tion?

Even though named entity recognition is developing rapidly and can already achieve impressive results, the technology still faces some challenges. In particular, adapting trained models to specialist texts does not always lead to the desired results. This is especially true if the data for transfer learning is not sufficient or specific enough. Because new entities keep appearing, models often have too little data to draw on. Zero-shot or few-shot approaches, which can work with a much smaller volume of data, offer a possible solution.
