Document Understanding User Guide
Last updated Feb 13, 2025
OCR
Each OCR engine is tailored to deliver efficient and effective optical character recognition, regardless of your specific needs or deployment. This page lists the languages supported by the UiPath® OCR engines:
- UiPath Document OCR: the default UiPath OCR engine, which receives regular updates and improvements. It runs on either GPU or CPU and delivers the same level of accuracy in both cases.
- UiPath Document OCR_CPU: specially optimized to run on CPU.
- UiPath Extended Languages OCR: capable of processing documents in over 200 languages, notably Chinese, Korean, Vietnamese, Thai, major Indian languages, and languages that use the Cyrillic or Greek alphabets.
Tip: Choosing the right OCR engine for your documents is simple. By default, use UiPath Document OCR, which benefits from regular updates and improvements. If it doesn't support your document language or doesn't perform as expected, switch to one of the other OCR engines, such as UiPath Extended Languages OCR.
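The selection rule in the tip can be sketched as a small fallback function. This is only an illustration: the engine names come from this page, but `choose_engine` is a hypothetical helper (not a UiPath API), and the language sets passed to it are placeholders that you would fill in from the support table.

```python
from typing import Optional, Set

# Engine names as they appear on this page.
DEFAULT_ENGINE = "UiPath Document OCR"
FALLBACK_ENGINE = "UiPath Extended Languages OCR"


def choose_engine(
    language_code: str,
    default_supported: Set[str],
    fallback_supported: Set[str],
    default_performs_well: bool = True,
) -> Optional[str]:
    """Hypothetical helper: apply the tip's 'default first, then fall back' rule.

    Returns the suggested engine name, or None if neither engine is
    listed as supporting the language.
    """
    code = language_code.upper()
    # Prefer the default engine when it supports the language and works well.
    if code in default_supported and default_performs_well:
        return DEFAULT_ENGINE
    # Otherwise try the extended-languages engine.
    if code in fallback_supported:
        return FALLBACK_ENGINE
    return None


# Placeholder support sets for illustration only; take real values from the table.
example_default = {"EN", "FR"}
example_fallback = {"EN", "FR", "TH"}

print(choose_engine("en", example_default, example_fallback))
print(choose_engine("th", example_default, example_fallback))
```

The `default_performs_well` flag stands in for the "not performing as expected" case, which in practice is a judgment made after testing on your own documents.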
Language (Language Code) | UiPath Document OCR and UiPath Document OCR_CPU | UiPath Extended Languages OCR | Chinese, Japanese, Korean OCR |
---|---|---|---|
Adyghe (ADY) | |||
Afar (AA) | |||
Afrikaans (AFR) | |||
Akan (AK) | |||
Albanian (SQI) | |||
Algonquin (ALQ) | |||
Angika (Devanagari) (ANP) | |||
Arabic (ARA) | |||
Asturian (AST) | |||
Asu (ASA) | |||
Avaric (AV) | |||
Awadhi-Hindi (Devanagari) (AWA) | |||
Aymara (AYM) | |||
Azerbaijani (Latin) (AZ) | |||
Bafia (KSF) | |||
Bagheli (BFY) | |||
Bambara (BM) | |||
Bashkir (BA) | |||
Basque (EU) | |||
Belarusian (Cyrillic) (BE, BE-CYRL) | |||
Belarusian (Latin) (BE, BE-LATN) | |||
Bemba (BEM) | |||
Bena (BEZ) | |||
Bhojpuri-Hindi (Devanagari) (BHO) | |||
Bikol (BIK) | |||
Bislama (BI) | |||
Bodo (Devanagari) (BRX) | |||
Bosnian (Latin) (BS) | |||
Brajbha (BRA) | |||
Breton (BR) | |||
Bulgarian (BG) | |||
Bundeli (BNS) | |||
Buryat (Cyrillic) (BUA) | |||
Catalan (CA) | |||
Cebuano (CEB) | |||
Chamling (RAB) | |||
Chamorro (CH) | |||
Chechen (CE) | |||
Chhattisgarhi (Devanagari) (HNE) | |||
Chiga (CGG) | |||
Chinese - Simplified (ZH-Hans) | |||
Chinese - Traditional (ZH-Hant) | |||
Choctaw (CHO) | |||
Chukot (CKT) | |||
Chuvash (CV) | |||
Cornish (KW) | |||
Corsican (CO) | |||
Cree (CR) | |||
Creek (MUS) | |||
Crimean Tatar (Latin) (CRH) | |||
Croatian (HR) | |||
Crow (CRO) | |||
Czech (CS) | |||
Danish (DA) | |||
Dargwa (DAR) | |||
Dari (PRS) | |||
Dhimal (Devanagari) (DHI) | |||
Dogri (Devanagari) (DOI) | |||
Duala (DUA) | |||
Dungan (DNG) | |||
Dutch (NL) | |||
Efik (EFI) | |||
English (EN) | |||
Erzya (Cyrillic) (MYV) | |||
Estonian (ET) | |||
Faroese (FO) | |||
Fijian (FJ) | |||
Filipino (FIL) | |||
Finnish (FI) | |||
Fon (FON) | |||
French (FR) | |||
Friulian (FUR) | |||
Ga (GAA) | |||
Gaelic - Irish (GA) | |||
Gaelic - Scottish (GD) | |||
Gagauz (Latin) (GAG) | |||
Galician (GL) | |||
Ganda (LG) | |||
Gayo (GAY) | |||
German (DE) | |||
Gilbertese (GIL) | |||
Gondi (Devanagari) (GON) | |||
Greek (EL) | |||
Greenlandic (KL) | |||
Guarani (GN) | |||
Gurung (Devanagari) | |||
Gusii (GUZ) | |||
Haitian Creole (HT) | |||
Halbi (Devanagari) (HLB) | |||
Hani (HNI) | |||
Haryanvi (BGC) | |||
Hawaiian (HAW) | |||
Hebrew (HE) | |||
Herero (HZ) | |||
Hiligaynon (HIL) | |||
Hindi (HI) | |||
Hmong Daw (Latin) (MWW) | |||
Ho (Devanagari) (HOC) | |||
Hungarian (HU) | |||
Iban (IBA) | |||
Icelandic (IS) | |||
Igbo (IG) | |||
Iloko (ILO) | |||
Inari Sami (SMN) | |||
Indonesian (ID) | |||
Ingush (INH) | |||
Interlingua (IA) | |||
Inuktitut (Latin) (IU) | |||
Italian (IT) | |||
Japanese (JA) | |||
Jaunsari (Devanagari) (JNS) | |||
Javanese (JV) | |||
Jola-Fonyi (DYO) | |||
Kabardian (KBD) | |||
Kabuverdianu (KEA) | |||
Kachin (Latin) (KAC) | |||
Kalenjin (KLN) | |||
Kalmyk (XAL) | |||
Kangri (Devanagari) (XNR) | |||
Kanuri (KR) | |||
Karachay-Balkar (KRC) | |||
Kara-Kalpak (Cyrillic) (KAA-CYR) | |||
Kara-Kalpak (Latin) (KAA) | |||
Kashubian (CSB) | |||
Kazakh (Cyrillic) (KK-CYR) | |||
Kazakh (Latin) (KK-LATN) | |||
Khakas (KJH) | |||
Khaling (KLR) | |||
Khasi (KHA) | |||
K'iche' (QUC) | |||
Kikuyu (KI) | |||
Kildin Sami (SJD) | |||
Kinyarwanda (RW) | |||
Komi (KV) | |||
Kongo (KN) | |||
Korean (KO) | |||
Korku (KFQ) | |||
Koryak (KPY) | |||
Kosraean (KOS) | |||
Kpelle (KPE) | |||
Kuanyama (KJ) | |||
Kumyk (Cyrillic) (KUM) | |||
Kurdish (Arabic) (KU-ARAB) | |||
Kurdish (Latin) (KU-LATN) | |||
Kurukh (Devanagari) (KRU) | |||
Kyrgyz (Cyrillic) (KY) | |||
Lak (LBE) | |||
Lakota (LKT) | |||
Latin (LA) | |||
Latvian (LV) | |||
Lezghian (LEX) | |||
Lingala (LN) | |||
Lithuanian (LT) | |||
Lower Sorbian (DSB) | |||
Lozi (LOZ) | |||
Lule Sami (SMJ) | |||
Luo (Kenya and Tanzania) (LUO) | |||
Luxembourgish (LB) | |||
Luyia (LUY) | |||
Macedonian (MK) | |||
Machame (JMC) | |||
Madurese (MAD) | |||
Mahasu Pahari (Devanagari) (BFZ) | |||
Makhuwa-Meetto (MGH) | |||
Makonde (KDE) | |||
Malagasy (MG) | |||
Malay (Latin) (MS) | |||
Maltese (MT) | |||
Malto (Devanagari) (KMJ) | |||
Mandinka (MNK) | |||
Manx (GV) | |||
Maori (MI) | |||
Mapudungun (ARN) | |||
Marathi (MR) | |||
Mari (Russia) (CHM) | |||
Masai (MAS) | |||
Mende (Sierra Leone) (MEN) | |||
Meru (MER) | |||
Meta' (MGO) | |||
Minangkabau (MIN) | |||
Mohawk (MOH) | |||
Mongolian (Cyrillic) (MN) | |||
Mongondow (MOG) | |||
Montenegrin (Cyrillic) (CNR-CYRL) | |||
Montenegrin (Latin) (CNR-LATN) | |||
Morisyen (MFE) | |||
Mundang (MUA) | |||
Nahuatl (NAH) | |||
Navajo (NV) | |||
Ndonga (NG) | |||
Neapolitan (NAP) | |||
Nepali (NE) | |||
Ngomba (JGO) | |||
Niuean (NIU) | |||
Nogay (NOG) | |||
North Ndebele (ND) | |||
Northern Sami (Latin) (SME) | |||
Norwegian (NO) | |||
Nyanja (NY) | |||
Nyankole (NYN) | |||
Nzima (NZI) | |||
Occitan (OC) | |||
Ojibway (OJ) | |||
Oromo (OM) | |||
Ossetic (OS) | |||
Pampanga (PAM) | |||
Pangasinan (PAG) | |||
Papiamento (PAP) | |||
Pashto (PS) | |||
Pedi (NSO) | |||
Persian (FA) | |||
Polish (PL) | |||
Portuguese (PT) | |||
Punjabi (Arabic) (PA) | |||
Quechua (QU) | |||
Ripuarian (KSH) | |||
Romanian (RO) | |||
Romansh (RM) | |||
Rundi (RN) | |||
Russian (RU) | |||
Rwa (RWK) | |||
Sadri (Devanagari) (SCK) | |||
Sakha (SAH) | |||
Samburu (SAQ) | |||
Samoan (Latin) (SM) | |||
Sango (SG) | |||
Sangu (Gabon) | |||
Sanskrit (Devanagari) (SA) | |||
Santali (Devanagari) (SAT) | |||
Scots (SCO) | |||
Sena (SEH) | |||
Serbian (Cyrillic) (SR-CYRL) | |||
Serbian (Latin) (SR, SR-LATN) | |||
Shambala (KSB) | |||
Shona (SN) | |||
Siksika (BLA) | |||
Sirmauri (Devanagari) (SRX) | |||
Skolt Sami (SMS) | |||
Slovak (SK) | |||
Slovenian (SL) | |||
Soga (XOG) | |||
Somali (Arabic) (SO) | |||
Somali (Latin) (SO-LATN) | |||
Songhai (SON) | |||
South Ndebele (NR) | |||
Southern Altai (ALT) | |||
Southern Sami (SMA) | |||
Southern Sotho (ST) | |||
Spanish (ES) | |||
Sundanese (SU) | |||
Swahili (Latin) (SW) | |||
Swati (SS) | |||
Swedish (SV) | |||
Tabassaran (TAB) | |||
Tachelhit (SHI) | |||
Tahitian (TY) | |||
Taita (DAV) | |||
Tajik (Cyrillic) (TG) | |||
Tamil (TA) | |||
Tatar (Cyrillic) (TT-CYRL) | |||
Tatar (Latin) (TT) | |||
Teso (TEO) | |||
Tetum (TET) | |||
Thai (TH) | |||
Thangmi (THF) | |||
Tok Pisin (TPI) | |||
Tongan (TO) | |||
Tsonga (TS) | |||
Tswana (TN) | |||
Turkish (TR) | |||
Turkmen (Latin) (TK) | |||
Tuvan (TYV) | |||
Udmurt (UDM) | |||
Uighur (Cyrillic) (UG-CYRL) | |||
Ukrainian (UK) | |||
Upper Sorbian (HSB) | |||
Urdu (UR) | |||
Uyghur (Arabic) (UG) | |||
Uzbek (Arabic) (UZ-ARAB) | |||
Uzbek (Cyrillic) (UZ-CYRL) | |||
Uzbek (Latin) (UZ) | |||
Vietnamese (VI) | |||
Volapuk (VO) | |||
Vunjo (VUN) | |||
Walser (WAE) | |||
Welsh (CY) | |||
Western Frisian (FY) | |||
Wolof (WO) | |||
Xhosa (XH) | |||
Yucatec Maya (YUA) | |||
Zapotec (ZAP) | |||
Zarma (DJE) | |||
Zhuang (ZA) | |||
Zulu (ZU) | |||

Supported OCR characters:
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ £ ¥ § © ® ° ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý ß à á â ã ä å æ ç è é ê ë ì í î ï ñ ò ó ô õ ö ø ù ú û ü ý Ā ā Ă ă Ą ą Ć ć Ċ ċ Č č Ď ď Đ đ Ē ē Ė ė Ę ę Ě ě Ğ ğ Ġ ġ Ħ ħ Ī ī Ĭ ĭ Į į İ ı Ĺ ĺ Ľ ľ Ł ł Ń ń Ň ň Ŋ ŋ Ō ō Ő ő Œ œ Ŕ ŕ Ř ř Ś ś Š š Ť ť Ŧ ŧ Ū ū Ŭ ŭ Ů ů Ų ų Ź ź Ż ż Ž ž Ə Ǵ ǵ Ș ș Ț ț ə μ א ב ג ד ה ו ז ח ט י ך כ ל ם מ ן נ ס ע ף פ ץ צ ק ר ש ת ₪ € ≤ ≥
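As a rough way to use the character list, the following sketch flags characters in an expected text that are not in it. It assumes the list above is complete, treats whitespace as always acceptable, and uses a hypothetical helper name (`unsupported_chars`); none of this is part of any UiPath package.

```python
import unicodedata

# Characters copied from the supported-characters list on this page.
SUPPORTED_CHARS = set(
    "!\"#$%&'()*+,-./0123456789:;<=>?@"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
    "abcdefghijklmnopqrstuvwxyz{|}~"
    "£¥§©®°¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝß"
    "àáâãäåæçèéêëìíîïñòóôõöøùúûüý"
    "ĀāĂăĄąĆćĊċČčĎďĐđĒēĖėĘęĚěĞğĠġĦħĪīĬĭĮįİı"
    "ĹĺĽľŁłŃńŇňŊŋŌōŐőŒœŔŕŘřŚśŠšŤťŦŧŪūŬŭŮůŲųŹźŻżŽž"
    "ƏǴǵȘșȚțəμ"
    "אבגדהוזחטיךכלםמןנסעףפץצקרשת"
    "₪€≤≥"
)


def unsupported_chars(text: str) -> list:
    """Hypothetical helper: return the sorted, de-duplicated characters of
    `text` that are missing from SUPPORTED_CHARS. Whitespace is ignored."""
    # Normalize to NFC so letters with combining accents compare as single characters.
    normalized = unicodedata.normalize("NFC", text)
    return sorted(
        {ch for ch in normalized if not ch.isspace() and ch not in SUPPORTED_CHARS}
    )


print(unsupported_chars("Total: 42 €"))
print(unsupported_chars("Ωmega"))
```

A non-empty result suggests that the text contains symbols the character list does not cover, which may be worth checking before relying on the OCR output for those documents.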