AI in processing trade documents – Record Linkage

The Census and Statistics Department (C&SD) is authorised by the Customs and Excise Department to process Import/Export Declarations (TDEC) and Cargo Manifests for compiling trade statistics. Over the years, C&SD has relied on the Electronic System for Cargo Manifests (EMAN), a computerised platform that matches TDECs and Cargo Manifests using rule-based algorithms. However, EMAN’s matching capability is limited to straightforward document pairs, requiring clerical staff to manually process unmatched Cargo Manifests by comparing unstructured text fields such as goods descriptions between documents.

Since 2021, C&SD has been developing Artificial Intelligence (AI) models for different transport modes to handle the Cargo Manifests that EMAN cannot match. The AI models for air and water transport were successfully implemented in 2024, followed by the road transport model in the second quarter of 2025. These models represent a significant advancement in C&SD’s trade document processing capabilities.

These AI models employ Natural Language Processing (NLP) techniques to simulate human decision-making, analysing the unstructured text data that EMAN cannot process. They calculate similarity scores for potential matches and accept a pair only when its score both exceeds a predefined threshold and is the highest among the candidates, which helps ensure matching accuracy.
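
The acceptance rule described above (score every candidate, keep only the top-scoring pair, and accept it only if it exceeds a threshold) can be illustrated with a minimal Python sketch. This is not C&SD's implementation: the article does not specify the NLP technique, so a simple bag-of-words cosine similarity stands in for it here, and the function names, the sample identifiers and descriptions, and the 0.6 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case a goods description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the token-count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(manifest_text, declarations, threshold=0.6):
    """Return (declaration_id, score) for the highest-scoring candidate,
    or None when no candidate scores above the (assumed) threshold."""
    scored = [
        (dec_id, cosine_similarity(manifest_text, text))
        for dec_id, text in declarations.items()
    ]
    if not scored:
        return None
    dec_id, score = max(scored, key=lambda pair: pair[1])
    return (dec_id, score) if score > threshold else None


# Hypothetical sample data, for illustration only.
declarations = {
    "TDEC-001": "Cotton T-shirts, mens, 500 cartons",
    "TDEC-002": "Frozen chicken wings 20 pallets",
}
print(best_match("mens cotton t shirts 500 ctns", declarations))
print(best_match("steel pipes", declarations))
```

In this sketch the first manifest description is paired with TDEC-001, while the second has no candidate above the threshold and would be left for manual processing. Tied top scores are not treated specially here; a production system would need an explicit rule for them.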

With all of these AI models in place by the second quarter of 2025, cases requiring manual processing have been reduced by approximately 40%. The resources saved have been reallocated to further the development of data science and of statistical work involving big data, with a view to enhancing the quality of statistical services provided to the public. Furthermore, this initiative has fostered valuable data science expertise among statistical professionals, positioning C&SD for continued innovation in statistical operations.