Work individually or in pairs. Discuss freely with the people around you. Please ask me and Professor Her for explanations when needed: keep us busy!
Make a list of Chinese words that you are interested in. These could be words that have a different meaning in Taiwan and China (like 同志 or 機車), or words which are only used in one area (such as 錄影機 or 录像机). Remember that in Linguistics we are usually more interested in the distribution and use of Chinese words (詞) than of hanzi.
Look the words up in a China corpus (the Leeds and Lancaster corpora are both at http://corpus.leeds.ac.uk/query-zh.html).
Now look the words up in a Taiwan corpus (Academia Sinica, at http://dbo.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh).
Note what differences you find in the usage of your chosen words.
Explore the Academia Sinica corpus in more detail.
Read the 簡介.
Choose different categories of texts (not only 全部). Note interesting results that you find.
Experiment with different POS (詞類).
Check the 特徵; try to figure out what “spo” and “spv” mean.
Try choosing 重疊詞, instead of 關鍵詞. Explain or discuss.
Try the 顯示複雜詞類特徵, 進階處理 and 自訂語料庫 options, when looking at the concordance output.
Take a closer look at the two China corpora.
Make concordances again, and click the number or ≫ on the left. What happens?
From there (on the Internet corpus, not the Lancaster corpus) click the www… link. Where does it take you?
Go back to the query window. Choose Collocations instead of Concordances (this doesn’t work on the Lancaster corpus, only the Internet corpus). Experiment with different settings for Collocations, and note your findings.
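If you are wondering what a collocation query does behind the scenes, here is a minimal sketch in Python. It assumes the text has already been segmented into words, and it only counts raw co-occurrences within a window around the keyword. The function name `collocates` and the toy sentence are invented for illustration; real collocation tools usually rank the results with association statistics rather than raw counts, so treat this as a rough picture of the idea, not a description of how the Leeds interface works.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count the words found within `window` tokens of each hit for `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the keyword itself
                counts[tokens[j]] += 1
    return counts

# A made-up, pre-segmented word list (not taken from any corpus).
tokens = ["我", "騎", "機車", "上班", "他", "騎", "機車", "上學"]
print(collocates(tokens, "機車", window=1).most_common())
```

Changing `window` here plays the same role as changing the span setting in a collocation query: a wider window picks up more distant neighbours, and also more noise.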
Log in to the Sketch Engine at www.sketchengine.co.uk. Use the user name nccu57 and password twman.
Browse the different corpora, which are in many languages, including Chinese.
Select the Traditional Chinese Gigaword corpus.
Open up Text types (click the + symbol).
Select either the Xinhua or CNA subcorpus.
Make concordances in each subcorpus for the words you chose in (1), and note the differences. Remember that CNA is nearly twice as big as Xinhua.
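Because the two subcorpora differ in size, raw hit counts are not directly comparable. One common way to handle this is to convert each count to a frequency per million tokens. The sketch below shows the arithmetic in Python; the hit counts and subcorpus sizes are invented purely for illustration, so substitute the real figures that Sketch Engine reports.

```python
def per_million(hits: int, corpus_size: int) -> float:
    """Return the frequency of a word per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 1_000_000 / corpus_size

# Hypothetical numbers only: these are NOT the real sizes of CNA or Xinhua.
cna = per_million(hits=3_000, corpus_size=500_000_000)
xinhua = per_million(hits=2_000, corpus_size=250_000_000)
print(f"CNA: {cna:.1f} per million; Xinhua: {xinhua:.1f} per million")
```

In this made-up case the word has more hits in the larger subcorpus but is relatively more frequent in the smaller one, which is exactly the kind of difference that raw counts hide.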
Still in Sketch Engine, think of a phrase which might be more common in Taiwan, or in China.
Open up Keywords (click the + symbol).
Type in your phrase.
Compare the results between the CNA and Xinhua subcorpora.
Complex queries:
To get everything beginning with 國, type "國.*" in the CQL box.
Figure out how to get everything that ends with 國.
Look at the Tagset summary. Notice the codes for “verb”, “noun”, “foreign word” etc.
To get a list of all foreign words in the corpus, type [tag="FW"] in the CQL box.
Click view options. Under attributes, check tag, then Change view options.
Look at the revised concordance, and notice the POS tag.
Now, figure out how to get a concordance of 國.* followed by a verb.
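If the CQL patterns feel unfamiliar, it may help to see the same two ideas on a toy example in Python. The sketch assumes that a quoted CQL pattern is a regular expression that must match the whole word, which is why it uses `re.fullmatch`, and that a tag query simply selects tokens with a given tag. The token list, the helper names `match_word` and `match_tag`, and the tags "N" and "V" are all invented for illustration; only "FW" comes from the exercise above, and the real codes are in the Tagset summary.

```python
import re

# Toy (word, tag) pairs; not taken from the Gigaword corpus.
tokens = [("國家", "N"), ("中國", "N"), ("國", "N"), ("DVD", "FW"), ("發展", "V")]

def match_word(pattern, toks):
    """Keep the words that the pattern matches from start to end."""
    return [word for word, _ in toks if re.fullmatch(pattern, word)]

def match_tag(tag, toks):
    """Keep the words that carry exactly this POS tag."""
    return [word for word, t in toks if t == tag]

print(match_word("國.*", tokens))  # words beginning with 國
print(match_tag("FW", tokens))     # words tagged as foreign words
```

Note that 國 on its own also matches "國.*", because `.*` can stand for zero characters, while 中國 does not, because the pattern has to cover the word from its first character.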
Experiment with the Word Sketch part of Sketch Engine. Word sketches give a summary of the keyword (關鍵詞) and its collocations (搭配詞).
Again, you can specify the subcorpus, choosing Taiwan or mainland.
Can you understand the function of Word Sketches?
Experiment with the Sketch Engine thesaurus.
Study the results carefully.
What differences can you find, compared to a traditional thesaurus (e.g. http://www.chinese-tools.com/tools/synonyms.html)?
Here is an interesting site, http://humanum.arts.cuhk.edu.hk/Lexis/chifreq/, which compares the frequency of characters (not words) in China and Taiwan (and Hong Kong).
Monday, November 16, 2009