The High-Performance and Explainable Language Processing Platform

Golem.ai Core is the no-training-required artificial intelligence solution for building high-performance, robust, frugal, and unbiased NLP projects.

Build all your use cases with Golem.ai Core

Build your project from A to Z with our versatile NLP platform.

Extract the content of your documents to save reading time and automate their processing.
OCR processing - Content analysis - Information extraction
Analysts can create network structures to extract knowledge from different sources.
Text extraction - Content analysis - Information linking
Process incoming messages by analyzing the message and its attachments.
Message and attachments analysis - Categorization - Information extraction
Protect your users' data and shield them from harmful content.

Take advantage of the power of our revolutionary NLU

Explainable, frugal, multilingual, and customizable.

We confirm the arrival of the cargo ship Louis Bleriot containing operational equipent for hospitals at the port of Le Havre from the port of 香港. A two-hour delay in the unloding operation is expected.

Tokenization
Selection and separation of words (tokens) to keep only the relevant elements. Tokenization is enriched by the configuration, which is used to pre-select the relevant tokens.
arrival cargo ship Louis Bleriot operational equipent hospitals Le Havre from 香港 two-hour delay operational unloding
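A minimal sketch of this kind of token selection, assuming a plain stop-word filter. The `STOP_WORDS` set and the `tokenize` function are hypothetical stand-ins for the platform configuration, not Golem.ai's implementation.

```python
import re

# Hypothetical stop-word list standing in for the platform configuration.
STOP_WORDS = {
    "we", "confirm", "the", "of", "a", "an", "at", "in", "is",
    "for", "containing", "port", "expected",
}


def tokenize(text, stop_words=STOP_WORDS):
    """Split text into word tokens and drop those listed in stop_words."""
    # \w matches Unicode letters, so non-Latin tokens such as 香港 are kept;
    # the optional group keeps hyphenated words like "two-hour" together.
    words = re.findall(r"\w+(?:-\w+)*", text)
    return [w for w in words if w.lower() not in stop_words]


if __name__ == "__main__":
    print(tokenize("We confirm the arrival of the cargo ship Louis Bleriot"))
```

Changing the stop-word set changes which tokens survive, which is the role the description above gives to the configuration.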
Dict:Multi
Correction of terms according to the appropriate language and business usage.
arrival cargo ship Louis Bleriot equipment operational hospitals Le Havre from Hong Kong delay two hours operational unloading
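As an illustration only, term correction can be pictured as a dictionary lookup. The `CORRECTIONS` mapping and `correct_tokens` function below are hypothetical; the source does not describe how Dict:Multi is implemented.

```python
# Hypothetical correction dictionary: misspellings and foreign-language forms
# mapped to the preferred business term.
CORRECTIONS = {
    "equipent": "equipment",
    "unloding": "unloading",
    "香港": "Hong Kong",
}


def correct_tokens(tokens, corrections=CORRECTIONS):
    """Replace each token found in the dictionary; leave the others unchanged."""
    return [corrections.get(tok, tok) for tok in tokens]


if __name__ == "__main__":
    print(correct_tokens(["operational", "equipent", "from", "香港", "unloding"]))
```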
Chunking
Grouping business terms to improve understanding.
arrival . cargo ship Louis Bleriot . equipment operational hospitals . Le Havre . from Hong Kong . delay . two hours operational unloading
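One simple way to group terms is a greedy longest-match against a list of known multi-word expressions. This is a sketch under that assumption; the `PHRASES` lexicon and the `chunk` function are hypothetical and much simpler than a production chunker.

```python
# Hypothetical lexicon of multi-word business terms, stored as token tuples.
PHRASES = {("cargo", "ship"), ("Le", "Havre"), ("Hong", "Kong"), ("two", "hours")}


def chunk(tokens, phrases=PHRASES):
    """Greedily merge the longest known phrase starting at each position."""
    max_len = max((len(p) for p in phrases), default=1)
    chunks, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-token phrases.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + size]) in phrases:
                chunks.append(" ".join(tokens[i:i + size]))
                i += size
                break
        else:
            # No multi-word phrase starts here: keep the single token.
            chunks.append(tokens[i])
            i += 1
    return chunks


if __name__ == "__main__":
    print(chunk(["arrival", "cargo", "ship", "Le", "Havre", "delay"]))
```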
Named Entity Recognition
Assigning an entity type to each term.
arrival (Status) . cargo ship (Transport) Louis (Nickname) Bleriot (Name) . equipment (Product) operational (Characteristic or action) hospitals (Sector) . Le Havre (Place) . from (Descriptor) Hong Kong (Place) . delay (Status) . two (Quantity) hours (Time) operational (Characteristic or action) . unloading (Action)
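The idea of assigning a type to each term can be sketched with a lookup table (a gazetteer). The `ENTITY_TYPES` table and `tag_entities` function are hypothetical; the type labels are taken from the example in this section, and the actual recognition method is not described in the source.

```python
# Hypothetical gazetteer mapping terms to entity types.
ENTITY_TYPES = {
    "arrival": "Status",
    "cargo ship": "Transport",
    "equipment": "Product",
    "hospitals": "Sector",
    "Le Havre": "Place",
    "Hong Kong": "Place",
    "delay": "Status",
    "unloading": "Action",
}


def tag_entities(chunks, entity_types=ENTITY_TYPES):
    """Pair each chunk with its entity type, or None when the term is unknown."""
    return [(chunk, entity_types.get(chunk)) for chunk in chunks]


if __name__ == "__main__":
    for term, entity_type in tag_entities(["arrival", "cargo ship", "Le Havre"]):
        print(term, "->", entity_type)
```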
Entity Linking
Creating links between entities to resolve a textual entity into a unique identifier from a knowledge base.
arrival (Status)
‘transport’ : ‘cargo ship’, ‘name’ : ‘Louis Bleriot’ (Transport)
‘product’ : ‘equipment’, ‘characteristic’ : ‘operational’, ‘sector’ : ‘hospitals’ (Product)
Le Havre (Place of arrival)
Hong Kong (Departure place)
delay (Status)
‘number’ : ‘2’, ‘time’ : ‘hours’ (Time)
unloading (Action)
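Resolving a textual entity into a unique identifier can be sketched as a normalized lookup in a knowledge base. Everything here is an assumption made for illustration: the `KNOWLEDGE_BASE` contents, the `link_entity` function, and the place identifiers are invented placeholders. Only the IMO number comes from this page's example.

```python
# Hypothetical knowledge base: (entity type, normalized name) -> identifier.
# The place identifiers are made-up placeholders.
KNOWLEDGE_BASE = {
    ("Transport", "louis bleriot"): "IMO 9776432",
    ("Place", "le havre"): "place:le-havre",
    ("Place", "hong kong"): "place:hong-kong",
}


def link_entity(entity_type, mention, kb=KNOWLEDGE_BASE):
    """Return the identifier for a mention, or None if it is not in the KB."""
    # Normalize case and whitespace so that spelling variants share one key.
    key = (entity_type, " ".join(mention.lower().split()))
    return kb.get(key)


if __name__ == "__main__":
    print(link_entity("Transport", "Louis Bleriot"))
```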
Dependency Parsing
Completing the understanding of the text by adding each term to an ontology.
status: arrival (Status)
transport > ship > cargo ship > Louis Bleriot: IMO 9776432 (Transport)
product > medical devices: medical devices (Product)
France > port: Le Havre (Place of arrival)
China > port: Hong Kong (Departure place)
status: delay (Status)
‘number’ : ‘2’, ‘time’ : ‘hours’ (Time)
action > action shipping: unloading (Action)
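The `a > b > c` paths in the example read like positions in a concept hierarchy. A minimal sketch of how such a path could be derived, assuming the ontology is stored as acyclic child-to-parent links; the `PARENT` table and `ontology_path` function are hypothetical.

```python
# Hypothetical ontology stored as child -> parent links (assumed acyclic).
PARENT = {
    "cargo ship": "ship",
    "ship": "transport",
    "medical devices": "product",
}


def ontology_path(term, parent=PARENT):
    """Return the path from the root concept down to term, joined with ' > '."""
    path = [term]
    # Walk up the hierarchy until a term with no parent (the root) is reached.
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return " > ".join(reversed(path))


if __name__ == "__main__":
    print(ontology_path("cargo ship"))
```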
Interaction
Linking different terms based on an ontology to form a unit of meaning.
Delivery tracking
  status: arrival (Status)
  transport > ship > cargo ship > Louis Bleriot: IMO 9776432 (Transport)
  product > medical devices: medical devices (Product)
  France > port: Le Havre (Place of arrival)
  China > port: Hong Kong (Departure place)
Delivery status
  status: delay (Status)
  ‘number’ : ‘2’, ‘time’ : ‘hours’ (Time)
  action > action shipping: unloading (Action)
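Forming a unit of meaning can be pictured as filling a template with the entities that belong to it. The sketch below assumes each interaction is defined by a list of entity types; the `INTERACTIONS` templates and the `build_interactions` function are hypothetical, and the interaction names are borrowed from the example.

```python
# Hypothetical interaction templates: name -> entity types it groups.
INTERACTIONS = {
    "Delivery tracking": ["Transport", "Product", "Place of arrival", "Departure place"],
    "Delivery status": ["Time", "Action"],
}


def build_interactions(entities, templates=INTERACTIONS):
    """Group (type, value) pairs into one dict per interaction template."""
    result = {}
    for name, types in templates.items():
        slots = {t: v for t, v in entities if t in types}
        if slots:
            # Only report interactions that matched at least one entity.
            result[name] = slots
    return result


if __name__ == "__main__":
    found = build_interactions([
        ("Transport", "Louis Bleriot"),
        ("Time", "2 hours"),
        ("Action", "unloading"),
    ])
    print(found)
```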

Text Extraction from images and documents

Easily transform your documents into usable texts using our Extractor technology.

Several OCRs and extraction libraries available via API.

				
Go:

package main

import (
  "fmt"
  "io"
  "net/http"
  "strings"
)

func main() {
  url := "https://extractor.golem.ai/v3/analyse"
  method := "POST"

  // JSON body: the URL of the file to analyse.
  payload := strings.NewReader(`{
    "file": "https://www.yourfile.pdf"
}`)

  client := &http.Client{}
  req, err := http.NewRequest(method, url, payload)
  if err != nil {
    fmt.Println(err)
    return
  }
  // Replace XXX with your own credentials.
  req.Header.Add("Authorization", "Basic XXX")
  req.Header.Add("Content-Type", "application/json")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
  body, err := io.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
				
			
				
PHP:

<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://extractor.golem.ai/scan',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 200,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS =>'{
    "file": "https://www.yourfile.pdf",
    "useCache": true,
    "parsers": {
        "document": {
            "extractImages": false,
            "ocr": {
                "name": "tesseract",
                "mode": "auto"
            },
            "PDF": {
                "extractImages": false,
                "ocr": {
                    "name": "ida",
                    "mode": "on"
                }
            }
        },
        "image": {
            "minimumHeight": 500,
            "minimumWidth": 500,
            "ocr": {
                "name": "ida",
                "mode": "off"
            },
            "png": {
                "minimumWidth": 100,
                "ocr": {
                    "name": "ida"
                }
            }
        },
        "spreadsheet": {
            "readVertically": false,
            "unmergeCells": false,
            "splitPerBlock": false,
            "splitPerBlockRowLimit": 10,
            "splitPerBlockColumnLimit": 10,
            "parseHiddenSheets": false
        },
        "email": {
            "extractAttachments": false,
            "ignoredAttachments": [
                "xlsb",
                "eml"
            ],
            "msg": {
                "extractAttachments": true
            }
        }
    }
}',
  CURLOPT_HTTPHEADER => array(
    'Authorization: Basic XXX',
    'Content-Type: application/json'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;
				
			
				
Python:

import requests
import json

if __name__ == "__main__":
    URL: str = "https://extractor.golem.ai/scan"

    payload: str = json.dumps(
        {
            "file": "https://www.yourfile.pdf",
            "parsers": {
                "document": {
                    "extractImages": False,
                    "ocr": {"name": "tesseract", "mode": "auto"},
                    "PDF": {
                        "extractImages": False,
                        "ocr": {"name": "ida", "mode": "on"},
                    },
                },
                "image": {
                    "minimumHeight": 500,
                    "minimumWidth": 500,
                    "ocr": {"name": "ida", "mode": "off"},
                    "png": {"minimumWidth": 100, "ocr": {"name": "ida"}},
                },
                "spreadsheet": {
                    "readVertically": False,
                    "unmergeCells": False,
                    "splitPerBlock": False,
                    "splitPerBlockRowLimit": 10,
                    "splitPerBlockColumnLimit": 10,
                    "parseHiddenSheets": False,
                },
                "email": {
                    "extractAttachments": False,
                    "ignoredAttachments": ["xlsb", "eml"],
                    "msg": {"extractAttachments": True},
                },
            },
        }
    )

    headers: dict = {"Authorization": "Basic XXX", "Content-Type": "application/json"}

    response: requests.Response = requests.request(
        "POST", URL, headers=headers, data=payload
    )

    print(response.text)
				
			
				
JavaScript (jQuery):

var settings = {
  "url": "https://extractor.golem.ai/v3/analyse",
  "method": "POST",
  "timeout": 0,
  "headers": {
    "Authorization": "Basic XXX",
    "Content-Type": "application/json"
  },
  "data": JSON.stringify({
    "file": "https://www.yourfile.pdf"
  }),
};

$.ajax(settings).done(function (response) {
  console.log(response);
});
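The samples in this section send an `Authorization: Basic XXX` header with `XXX` as a placeholder. Assuming the API uses standard HTTP Basic authentication (the page does not say how the credential is formed), the header value is the Base64 encoding of `username:password`. The helper name `basic_auth_header` is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(basic_auth_header("user", "pass"))
```

The returned string can be used as the `Authorization` header value in any of the request samples.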
				
			

Golem.ai protects and respects your data

Our artificial intelligence allows us to respect your data by design.

Security

Golem.ai follows the cryptography recommendations issued by ANSSI.

Privacy

Golem.ai's AI is hosted at Scaleway in France. You remain the exclusive user and owner of your data.

Compliance

An accessible and documented API, available connectors.

Join the Golem.ai community

Do you have an NLP project? Try our Core technology by signing up for the waiting list.