moHANA

Morphological Hangul Analyzer

Seung-hyun Seo, In-Ho Kang, and Jae-Dong Kim


Version: 0.9
Date: 11.21.2007

Overview

 

moHANA is a morphological Hangul analyzer that analyzes Korean words. It categorizes parts of speech along five general dimensions: word class, morphological, syntactic, semantic, and pragmatic. In other words, a part-of-speech tag in moHANA carries all the information required to analyze a Korean word.

 

Binaries

Our beta version is free to download and use. However, it reports only the value of the first part-of-speech dimension.

 

Installation

To install moHANA, download moHANA.tar.gz.

 

How to use

moHANA is called with the following parameters:

        -h         - show this help

        -1         - use level 1 grammar (default)
        -2         - use level 1 and 2 grammars

        -n         - show all analyzed results
        -c         - convert hanja to hangul
        -i         - show the original inflected form of a word
                     (the default is to show the root form)
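
As a rough illustration of how these options fit together, the sketch below builds an argument list in Python. The helper name `build_command` and its keyword parameters are hypothetical; only the flags and the `./moHANA` path come from this document. It also assumes that the flags can be combined as separate arguments, which is not stated here. How input words are passed to the program is not documented in this text, so the sketch stops at constructing the arguments.

```python
from typing import List


def build_command(level: int = 1,
                  show_all: bool = False,
                  hanja_to_hangul: bool = False,
                  inflected: bool = False,
                  binary: str = "./moHANA") -> List[str]:
    """Build an argument list for moHANA (hypothetical helper).

    Assumption: flags may be given together as separate arguments.
    """
    if level not in (1, 2):
        raise ValueError("level must be 1 or 2")
    args = [binary]
    if level == 2:
        args.append("-2")   # level 1 and 2 grammars; level 1 is the default
    if show_all:
        args.append("-n")   # show all analyzed results
    if hanja_to_hangul:
        args.append("-c")   # convert hanja to hangul
    if inflected:
        args.append("-i")   # show the inflected form instead of the root form
    return args


print(build_command(level=2, show_all=True))
```

The resulting list could be handed to something like `subprocess.run`, once the input convention of the binary is known.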

 

Some Example Analyses

    ./moHANA
      
  • 학교는
  •   학교_{ncn} + 는_{j}
     
  • 학생및선생
  •   학생_{ncn} + 및_{ad} + 선생_{ncn}
     
  • 안적는
  •   안적는_{unk}
     
  • 엄마와아이
  •   엄마_{ncn} + 와아이_{unk}
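
The analyses above follow a regular `form_{tag} + form_{tag}` pattern, so they are easy to post-process. The following is a minimal sketch, not part of moHANA, that splits one analysis line into (form, tag) pairs and checks for the `unk` tag. The function names `parse_analysis` and `has_unknown` are hypothetical, and the sketch assumes that a morpheme never contains the separator " + ".

```python
import re
from typing import List, Tuple

# One morpheme is written as  form_{tag}  (the tag itself may contain "_").
_MORPHEME = re.compile(r"^(?P<form>.+)_\{(?P<tag>[^{}]+)\}$")


def parse_analysis(line: str) -> List[Tuple[str, str]]:
    """Split an analysis line into (form, tag) pairs."""
    # Drop surrounding whitespace and a leading list bullet, if present.
    text = line.strip().lstrip("•").strip()
    pairs = []
    for chunk in text.split(" + "):
        match = _MORPHEME.match(chunk.strip())
        if match is None:
            raise ValueError(f"unrecognized morpheme: {chunk!r}")
        pairs.append((match.group("form"), match.group("tag")))
    return pairs


def has_unknown(pairs: List[Tuple[str, str]]) -> bool:
    """Return True if any morpheme carries the unk tag."""
    return any(tag == "unk" for _, tag in pairs)


pairs = parse_analysis("엄마_{ncn} + 와아이_{unk}")
print(pairs, has_unknown(pairs))
```

A check like `has_unknown` is one way to spot words, such as 안적는 and 엄마와아이 above, that the default grammar leaves unanalyzed.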

     

    Analyze grammatically incorrect expressions

        ./moHANA -2
          
  • 학교는
  •   학교_{ncn} + 는_{j}
     
  • 학생및선생
  •   학생_{ncn} + 및_{ad} + 선생_{ncn}
     
  • 안적는
  •   안_{ad} + 적_{pv} + 는_{ef}
     
  • 엄마와아이
  •   엄마_{ncn} + 와_{j} + 아이_{ncn}

     

    Analyze common writing errors

        ./moHANA -2
          
  • 잘먹는
  •   잘_{ad} + 먹_{pv} + 는_{ef}
     
  • 안예뻐지는
  •   안_{ad} + 예뻐_{pa} + 어_{ef} + 지_{aux} + 는_{ef}
     
  • 못자르는
  •   못_{ad} + 자르_{pv} + 는_{ef}
     
  • 안미끄러운
  •   안_{ad} + 미끄럽_{pa} + 은_{ef}
     
     
  • 편지에대한
  •   편지_{ncp} + 에_{j} + 대하_{pv} + ㄴ_{ef}
     
  • 청주에가는방법
  •   청주_{ncn} + 에_{j} + 가_{pv} + 는_{ef} + 방법_{ncn}
     
  • 생존하기위한방법
  •   생존_{ncp} + 하_{vfix} + 기_{ef} + 위하_{pv} + ㄴ_{ef} + 방법_{ncn}
     
  • 태풍으로인한피해
  •   태풍_{ncn} + 으로_{j} + 인하_{pv} + ㄴ_{ef} + 피해_{ncp}
     
     
  • 먹지못하는
  •   먹_{pv} + 지_{ef} + 못하_{aux} + 는_{ef}
     
     
  • 책상과의자
  •   책상_{ncn} + 과_{j} + 의자_{ncn}
     
  • 이효리의남자친구
  •   이효리_{nq_per} + 의_{j} + 남자_{ncn} + 친구_{ncn}

     

    Analyze very long legal terms

        ./moHANA -2
          
  • 지가공시및토지등의평가에대한법률
  •   지가공시_{ncp} + 및_{ad} + 토지_{ncn} + 등_{nfix} + 의_{j} + 평가_{ncp} + 에_{j} + 대하_{pv} + ㄴ_{ef} + 법률_{ncn}
     
  • 성매매알선등행위의처벌에대한법률
  •   성매매_{ncn} + 알선_{ncp} + 등_{nfix} + 행위_{ncn} + 의_{j} + 처벌_{ncp} + 에_{j} + 대하_{pv} + ㄴ_{ef} + 법률_{ncn}

     

    Generic affix analysis

        ./moHANA
          
  • 서귀포시를
  •   서귀포_{nq_loc} + 시_{nfix} + 를_{j}
     
  • 서귀포역을
  •   서귀포_{nq_loc} + 역_{nfix} + 을_{j}
     
  • 서귀포점을
  •   서귀포_{nq_loc} + 점_{nfix} + 을_{j}
     
     
  • (주)마이크로소프트
  •   (주)_{pref} + 마이크로_{nq_gro} + 소프트_{ncp}
     
  • (주)포항제철
  •   (주)_{pref} + 포항_{nq_gro} + 제철_{ncp}
     
  • (주)컴퓨터
  •   (_{ascii} + 주_{nc_one} + )_{ascii} + 컴퓨터_{ncn}

     

Questions and Bug Reports

If you have questions, please contact us via shsuh at wordwords.co.kr.

 

Disclaimer

This software is free for non-commercial use only. It must not be distributed without the prior permission of WordWords Corp. WordWords Corp. has applied for a Korean patent (10-2007-0024439).

