Web-based Lexical Complexity Analyzer

The Lexical Complexity Analyzer (LCA), developed by Professor Xiaofei Lu at The Pennsylvania State University, is a tool that allows language teachers and researchers to analyze the lexical complexity of written English language samples, using 25 different measures of lexical density, variation, and sophistication proposed in the first and second language development literature. The software runs on UNIX-like systems (Linux, Mac OS, or UNIX) and requires the input texts to be part-of-speech (POS) tagged and lemmatized. Using it therefore likely calls for familiarity with the command-line interface as well as some programming skills (e.g., for part-of-speech tagging and lemmatization). The web-based interface to LCA, available on this website, eliminates the need for the command-line interface, streamlines the above-mentioned natural language processing (NLP) steps, and generates the results in just a few clicks.
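
To give a rough sense of what measures of lexical density and variation capture, here is a minimal Python sketch. It is not the LCA implementation. It assumes the input is already POS-tagged and lemmatized as a list of (lemma, tag) pairs with Penn Treebank-style tags, and it uses a simplified, assumed definition of "lexical word" (any noun, verb, adjective, or adverb tag). The function names and the sample data are hypothetical.

```python
# Illustrative sketch only; NOT the LCA implementation.
# Assumption: tokens arrive as (lemma, pos_tag) pairs with Penn Treebank-style tags.
# Assumption: a "lexical word" is any token tagged as noun, verb, adjective, or adverb.
LEXICAL_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def lexical_density(tagged):
    """Share of tokens whose tag marks a lexical (content) word."""
    if not tagged:
        return 0.0
    lexical = sum(1 for _, tag in tagged if tag.startswith(LEXICAL_TAG_PREFIXES))
    return lexical / len(tagged)


def type_token_ratio(tagged):
    """Number of distinct lemmas divided by the total number of tokens."""
    if not tagged:
        return 0.0
    lemmas = [lemma.lower() for lemma, _ in tagged]
    return len(set(lemmas)) / len(lemmas)


# Hypothetical tagged and lemmatized sample for "The cat chased the cat quickly".
sample = [
    ("the", "DT"), ("cat", "NN"), ("chase", "VBD"),
    ("the", "DT"), ("cat", "NN"), ("quickly", "RB"),
]

print(lexical_density(sample))   # 4 lexical tokens out of 6
print(type_token_ratio(sample))  # 4 distinct lemmas out of 6 tokens
```

The definitions used by the actual tool may differ in detail (for example, in which verbs count as lexical words), so treat the numbers from this sketch as illustrative only.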

Web-based LCA: Single Mode

The single mode allows you to analyze a single text (or compare two texts) for selected lexical complexity measures. You may choose to see the results of any or all of the 25 indices, and the system will create a graphical representation to visualize the results. Additionally, you may enter a second text in order to compare the lexical complexity of the two.

Web-based LCA: Batch Mode

The batch mode allows you to analyze the lexical complexity of written English samples, up to 200 files at a time. The results are delivered as a CSV file that can subsequently be imported into spreadsheets or statistical packages for further analysis. Note that the batch mode requires you to register an account before using it. Registration is free and takes less than a minute.
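
As one possible way to work with the batch results outside a spreadsheet, the following Python sketch averages every numeric column in a CSV file using only the standard library. The column names (`filename`, `measure_a`, `measure_b`) and the values are placeholders invented for illustration; the actual column layout of the LCA output is not described here, so adjust `id_column` to whatever the downloaded file uses.

```python
import csv
import io
from statistics import mean


def column_means(csv_file, id_column="filename"):
    """Return the mean of every numeric column in an open CSV file.

    `id_column` is a hypothetical name for the column that identifies
    each text; it is skipped, as are cells that are not numbers.
    """
    values = {}
    for row in csv.DictReader(csv_file):
        for name, cell in row.items():
            if name == id_column:
                continue
            try:
                number = float(cell)
            except (TypeError, ValueError):
                continue  # ignore non-numeric or missing cells
            values.setdefault(name, []).append(number)
    return {name: mean(cells) for name, cells in values.items()}


# Placeholder data standing in for a downloaded results file.
# These column names and numbers are made up, not real LCA output.
placeholder = io.StringIO(
    "filename,measure_a,measure_b\n"
    "essay1.txt,0.50,10\n"
    "essay2.txt,0.70,20\n"
)

print(column_means(placeholder))
```

With a real download, the same function could be called on a file opened with `open(path, newline="")`, where `path` points to the saved CSV.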

By using the web-based software described above, you are acknowledging that you agree to be legally bound and to abide by the LCA Terms of Service. If you intend to publish a paper that used the web-based interface to the LCA software, please cite:

  • Ai, Haiyang and Lu, Xiaofei (2010). A web-based system for automatic measurement of lexical complexity. Paper presented at the 27th Annual Symposium of the Computer-Assisted Language Consortium (CALICO-10). Amherst, MA. June 8-12.
  • Lu, Xiaofei (2012). The Relationship of Lexical Richness to the Quality of ESL Learners' Oral Narratives. The Modern Language Journal, 96(2), 190-208.

If you have any questions, problems, or suggestions regarding the web-based interface to the LCA software, please feel free to contact me at: Haiyang Ai's email.