Yousa-Mirage/methods_evolution

Methods Evolution

A project for extracting research methods from academic papers and analyzing how they evolve over time.

Environment Setup

This project uses uv for Python environment management:

# Install uv if not already installed
pip install uv
# Clone the repository
git clone https://github.com/tianranchunzhen/methods-evolution.git
cd methods-evolution
# Recreate the project environment from the project's dependency files
uv sync

Project Dependencies

This project requires Python 3.10+ and currently uses the following key packages:

  • httpx: HTTP client for API requests
  • loguru: Logging utility
  • marker-pdf: PDF to Markdown converter
  • polars: Fast dataframe library
  • pyahocorasick: Efficient pattern matching
  • spaCy: NLP toolkit (uses the en_core_web_lg-3.8.0 model)
  • toml: Configuration file parser
  • torch & transformers: Deep learning frameworks (specified in pyproject.toml)

Workflow

  1. Prepare the method dictionary: Parse the NCRM research method typology into a structured TOML format
  2. Collect sample papers: Fetch papers and their metadata from academic sources
  3. Process papers: Convert PDFs to Markdown using Marker
  4. Extract methods: Match method terms in the paper texts using the Aho-Corasick algorithm
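
Step 4 can be illustrated with a minimal, standard-library-only sketch of Aho-Corasick matching. The project itself lists pyahocorasick for this step; the code below is not the repository's implementation. The term list, the function names (`build_automaton`, `find_methods`), and the word-boundary filter are assumptions made for illustration, and the sketch assumes text where lowercasing does not change string length.

```python
from collections import deque


def build_automaton(terms):
    """Build an Aho-Corasick automaton (trie + failure links) over the terms."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> terms that end at this state
    for term in terms:
        state = 0
        for ch in term.lower():
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(term)

    # Breadth-first pass: depth-1 states keep fail = 0; deeper states
    # derive their failure link from their parent's failure link.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit terms that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_methods(text, automaton):
    """Return (start_index, term) for every whole-word, case-insensitive match."""
    goto, fail, out = automaton
    lowered = text.lower()
    state = 0
    hits = []
    for i, ch in enumerate(lowered):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in out[state]:
            start, end = i - len(term) + 1, i + 1
            # Illustrative filter: skip matches embedded in a longer word.
            left_ok = start == 0 or not lowered[start - 1].isalnum()
            right_ok = end == len(lowered) or not lowered[end].isalnum()
            if left_ok and right_ok:
                hits.append((start, term))
    return hits


# Hypothetical method terms; the real dictionary comes from the NCRM typology.
terms = ["case study", "survey", "grounded theory", "content analysis"]
automaton = build_automaton(terms)
text = "We combine a survey with a case study and qualitative content analysis."
hits = find_methods(text, automaton)
for start, term in hits:
    print(start, term)
```

The automaton is built once and reused across all papers, so each text is scanned in a single pass regardless of how many terms the dictionary holds. Plain Aho-Corasick reports substring matches, which is why the sketch adds a boundary check; whether the project applies a similar filter is not stated here.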

TODO:

  1. Try AutoPhrase and evaluate the results
  2. Handle synonyms
  3. Run a simple analysis of the results
  4. Try extracting from the title and abstract or from the method section only, and compare the results

Project Structure

  • Data/: Paper data, texts, and method dictionaries
  • Docs/: Some reference documents
  • Scripts/: Processing and analysis scripts
  • Results/: Output of method extraction
  • Models/: Machine learning models (currently the en_core_web_lg-3.8.0 model)
