slopeq_for_nkjp

Differences

This shows you the differences between two versions of the page.

slopeq_for_nkjp [2016/04/18 12:58] pezik
slopeq_for_nkjp [2017/02/03 00:17] (current) – [Slop factor with relaxed order] mmolenda
Line 5: Line 5:
 ===== Word form queries =====

-This is the simplest type of queries. Just type in the word in the search box and click the “search” button. The result is presented in the form of a KWIC (Key Words In Context) list, with the number of occurrences of a given lexical item displayed above.
+This is the simplest type of queries. Just type in the word in the search box and click the “search” button. The result is presented in the form of a KWIC (Key Words In Context) list, with the number of sentences matching the query displayed above.

-[[http://tinyurl.com/oer2y2o|maszt]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/maszt/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/0/medium/0|maszt]]

 {{:maszt.jpg|}}
Line 13: Line 13:
 The same method can be used to find sequences of two or more items:

-[[http://tinyurl.com/njjq4y2|na zdrowie]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/na%20zdrowie/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/0/medium/0|na zdrowie]]

 {{:na_zdrowie.jpg|}}
Line 23: Line 23:
 The format of the query is as follows: opening angle bracket + “lemma=” + the base form of the word sought + closing angle bracket. For instance, <lemma=potwór> will fetch all forms of “potwór”, including potwór, potwora, potworów, potworem, etc.

-[[http://tinyurl.com/pjfral9|<lemma=potwór>]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/potw%C3%B3r/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/0/medium/0|<lemma=potwór>]]

 {{:potwor.jpg|}}
Line 34: Line 34:
 Base form queries can be combined with surface queries: <lemma=widzieć> problemy will fetch phrases such as: widzę problemy, widzimy problemy, widzieli problemy, etc.

-[[http://tinyurl.com/nvfwxaf|<lemma=widzieć> problemy]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/%3Clemma%3Dwidzie%C4%87%3E%20problemy/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/0/medium/0|<lemma=widzieć> problemy]]

 {{:widze_problemy.jpg|}}
Line 40: Line 40:
 It is possible to combine two or more base form queries, for instance: <lemma=jeździć> <lemma=samochód>. The KWIC list for this query includes items such as: jeździmy samochodami, jadę samochodem, jeżdżą samochodem, etc.

-[[http://tinyurl.com/ojl4r6p|<lemma=jeździć> <lemma=samochód>]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/%3Clemma%3Dje%C5%BAdzi%C4%87%3E%20%3Clemma%3Dsamoch%C3%B3d%3E/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/0/medium/0|<lemma=jeździć> <lemma=samochód>]]

 {{:jezdzic_samochodem.jpg|}}
Line 51: Line 51:
 The pipe symbol “|” represents an alternative between two or more words, e.g. “kupować|sprzedawać|remontować samochód|samochody” will fetch all the examples of "kupować", "sprzedawać" or "remontować" followed by either the singular or the plural nominative form of the noun "samochód" from the corpus.

-[[http://tinyurl.com/p5ovkmv|kupować|sprzedawać|remontować samochód|samochody]]
+[[https://tinyurl.com/z56ro6y|kupować|sprzedawać|remontować samochód|samochody]]

 {{:kupowac_sprzedawac_samochod.jpg|}}
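To make the alternation rule concrete, the following minimal Python sketch lists every plain word sequence that a query with “|” covers. The function name `expand_alternatives` is hypothetical and the code only illustrates the logic described above; it is not how the NKJP search engine is implemented, and it does not handle the “|!” negation operator.

```python
from itertools import product


def expand_alternatives(query):
    """Expand a query such as 'a|b c|d' into every word sequence it covers."""
    # Each whitespace-separated slot holds one or more alternatives joined by "|".
    slots = [slot.split("|") for slot in query.split()]
    # The Cartesian product picks one alternative per slot, in query order.
    return [" ".join(choice) for choice in product(*slots)]


phrases = expand_alternatives("kupować|sprzedawać|remontować samochód|samochody")
for phrase in phrases:
    print(phrase)
```

Three verbs combined with two noun forms give six phrases, which is why a single query with alternatives can replace six separate word form queries.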
Line 59: Line 59:
 ==== Slop factor ====

-Slop factor allows the users to decide on the maximum number of words that can appear between the elements of a multi-word query (also referred to as “intervening words”). For instance, slop factor set to “1” in the case of “wielki człowiek” will return “wielki człowiek”, as well as “wielki mały człowiek” or “wielki jest człowiek”, since one lexical item was allowed to appear between “wielki” and “człowiek”. In order to set the Slop factor, place your query in brackets and specify the value, preceded by the equality sign, e.g. (wielki człowiek)=1.
+Slop factor allows the users to decide on the maximum number of words that can appear between the elements of a multi-word query (also referred to as “intervening words”). For instance, slop factor set to “1” in the case of “wielki człowiek” will return “wielki człowiek”, as well as “wielki mały człowiek” or “wielki jest człowiek”, since one lexical item was allowed to appear between “wielki” and “człowiek”. In order to set the Slop factor, choose the value using the slider located below the search bar. In this document the position of the slider is indicated by the following expression (Slop factor = 1,2,3...).

-[[http://tinyurl.com/qzjxvbl|(wielki człowiek)=1]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/wielki%20cz%C5%82owiek/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/1/medium/0|wielki człowiek (Slop factor = 1)]]

 {{:wielki_czlowiek.jpg|}}
Line 67: Line 67:
 Setting a higher slop value is recommended in the case of words that might be located further away from one another in the sentence. For instance, (ciebie kocham)=2 returns the following results: “ciebie też bardzo kocham”, “Ciebie w nich kocham” or “ciebie i nadal kocham”.

-[[http://tinyurl.com/pdjwubs|(ciebie kocham)=2]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/ciebie%20kocham/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/2/medium/0|ciebie kocham (Slop factor = 2)]]

 {{:ciebie_kocham.jpg|}}
Line 73: Line 73:
 Note: Remember that the slop factor is the total number of the intervening words. Thus, in the case of queries which are longer than two elements, one should take into account all the possible positions of a word within the string. For instance, in order to retrieve the Polish proverb “Nosił wilk razy kilka, ponieśli i wilka”, using the following string of words: “nosił razy ponieśli”, one should set the slop factor to “4”, since there are three intervening words and one comma that need to be added. Please note that **punctuation marks count as words**.

-[[http://tinyurl.com/otkn2hb|(nosił razy ponieśli)=4]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/nosi%C5%82%20razy%20ponie%C5%9Bli/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/true/4/medium/0|nosił razy ponieśli (Slop factor = 4)]]

 {{:nosil_wilk.jpg|}}
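The slop rule described above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the tokenizer (a simple regular expression that treats each punctuation mark as its own token) and the function names `tokenize` and `matches_with_slop` are hypothetical, and the real NKJP engine may tokenize and count positions differently.

```python
import re


def tokenize(text):
    # Lowercase words and punctuation marks; every punctuation mark is its own token,
    # mirroring the rule that punctuation marks count as words.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def matches_with_slop(text, query, slop=0):
    """Return True if the query words occur in order with at most `slop` extra tokens between them."""
    tokens = tokenize(text)
    terms = query.lower().split()
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos = start
        for term in terms[1:]:
            try:
                # Take the earliest later occurrence, which keeps the span as short as possible.
                pos = tokens.index(term, pos + 1)
            except ValueError:
                break
        else:
            # Tokens inside the span that are not query terms are the intervening words.
            intervening = (pos - start + 1) - len(terms)
            if intervening <= slop:
                return True
    return False


print(matches_with_slop("wielki jest człowiek", "wielki człowiek", slop=1))
print(matches_with_slop("ciebie też bardzo kocham", "ciebie kocham", slop=2))
```

The slop value is compared against the total number of non-query tokens in the whole span, not against the gap between each pair of neighbouring query words, which is the point made in the note above.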
Line 79: Line 79:
 ==== Slop factor with relaxed order ====

-This operator combines the regular slop factor with the option to change the order of the words typed in the searchbox. In the format of the query, the tilde “~” is used instead of the sign of equation.
+This operator combines the regular slop factor with the option to change the order of the words typed in the searchbox. In NKJP 2, unchecking the "order" box is necessary to run relaxed-order queries. Please note that in this Wiki the relaxed order is represented by the expression ("order" unchecked).

-[[http://tinyurl.com/oxktqgb|(problem jest)~2]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/problem%20jest/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|problem jest (Slop factor = 2) ("order" unchecked) ]]

 {{:jest_problem.jpg|}}
  
-Setting the slop factor to "0" can be used to take advantage of the relaxed word order option without any intervening words.
+Unchecking the "order" box with Slop factor set to 0 can be used to take advantage of the relaxed word order option without any intervening words.

-[[http://tinyurl.com/n9gzuzc|(ciebie na)~0]]
+[[http://tinyurl.com/n9gzuzc|ciebie na ("order" unchecked)]]

 {{:ciebie_na.jpg|}}
  
-Slop factor can be combined with other functionalities. For instance, "(<lemma=widzieć> kogo)~2" will fetch examples of every form of the verb "widzieć" combined with "kogo", appearing in any order, and separated by the maximum of two intervening words.
+Slop factor can be combined with other functionalities. For instance, "<lemma=widzieć> kogo (Slop factor = 2) ("order" unchecked)" will fetch examples of every form of the verb "widzieć" combined with "kogo", appearing in any order, and separated by the maximum of two intervening words.

-[[http://tinyurl.com/nufj5s6|(<lemma=widzieć> kogo)~2]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/%3Clemma%3Dwidzie%C4%87%3E%20kogo/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|<lemma=widzieć> kogo (Slop factor = 2) ("order" unchecked)]]

 {{:kogo_widze.jpg|}}
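The example links on this page suggest how the slider and the "order" box end up in the search URL: the links differ only in the encoded query, a `true`/`false` field and the number that follows it. The Python sketch below rebuilds such a link on that basis. This is an inference from the links shown here, not a documented interface; the function name `build_search_url` is hypothetical, and the long middle part of the path is copied verbatim without interpreting its fields.

```python
from urllib.parse import quote

BASE = "http://pelcra.clarin-pl.eu/NKJP/#search/pl/"
# Copied verbatim from the example links; the meaning of these fields is not interpreted here.
FIXED = "true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1"


def build_search_url(query, slop=0, ordered=True):
    """Build a search link with the given slop factor and word-order setting."""
    # Percent-encode the query as UTF-8; "*" stays literal, as in the example links.
    encoded = quote(query, safe="*")
    # Assumption: "true" corresponds to the "order" box being checked, "false" to unchecked.
    order_flag = "true" if ordered else "false"
    return f"{BASE}{encoded}/{FIXED}/{order_flag}/{slop}/medium/0"


print(build_search_url("<lemma=widzieć> kogo", slop=2, ordered=False))
```

With these inputs the sketch reproduces the link given above for "<lemma=widzieć> kogo (Slop factor = 2) ("order" unchecked)".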
Line 101: Line 101:
 This operator excludes specified variants of query terms from the results. Consequently, it must be combined with query types that produce variation in the results. Negation is marked by a pipe sign with an exclamation mark “|!”, which is to be read as “but not”. The example shows how it is used with a base form query. The specified form of the word is excluded from the results:

-[[http://tinyurl.com/necsebo|<lemma=prosić>|!proszę]]
+[[https://tinyurl.com/zumgzkr|<lemma=prosić>|!proszę]]

 {{:prosic.jpg|}}
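As a rough illustration of the “but not” reading, the Python sketch below splits a negated term into its base query and the excluded forms, then filters a list of word forms. The helper names `parse_negation` and `apply_negation` and the sample list of forms are hypothetical; in the corpus the forms would come from the base form query itself.

```python
def parse_negation(term):
    """Split '<lemma=prosić>|!proszę' into the base query and the excluded forms."""
    base, *excluded = term.split("|!")
    return base, excluded


def apply_negation(forms, excluded):
    """Drop every form that was listed after '|!' (case-insensitive)."""
    blocked = {form.lower() for form in excluded}
    return [form for form in forms if form.lower() not in blocked]


base, excluded = parse_negation("<lemma=prosić>|!proszę")
# Hypothetical sample of forms that a base form query for "prosić" might return.
forms = ["proszę", "prosi", "prosimy", "prosili"]
kept = apply_negation(forms, excluded)
print(base, kept)
```

The filter only makes sense on top of a query that yields several variants, which is why negation has to be combined with query types such as base form queries.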
Line 113: Line 113:
 A full stop “.” is a wildcard: it stands for any single character. A full stop used within a word replaces exactly one letter. In the case of "zaka.ę", it may be "ż" or "ł".

-[[http://tinyurl.com/pd26448|zaka.ę]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/zaka.%C4%99/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|zaka.ę]]

 {{:zakaze.jpg|}}
Line 121: Line 121:
 A plus “+” is a quantifier: the preceding character can appear one or more times.

-[[http://tinyurl.com/q2ax63f|wan+a]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/wan%2Ba/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|wan+a]]

 {{:wanna.jpg|}}
Line 129: Line 129:
 An asterisk “*” is another quantifier: the preceding character can appear zero or more times. Thus, "pan*a" will fetch both "pana" (one "n") and "panna" (two "n"s); with zero occurrences of "n" it would also match "paa".

-[[http://tinyurl.com/olo924a|pan*a]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/pan*a/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|pan*a]]

 {{:panna.jpg|}}
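The two quantifiers behave like their counterparts in Python's `re` module, so their effect can be checked offline. The sketch below assumes that a pattern must match a whole word, which is why it uses `re.fullmatch`; the helper name `token_matches` is hypothetical and the corpus engine itself may differ in details.

```python
import re


def token_matches(pattern, token):
    """Return True if the whole token matches the pattern."""
    return re.fullmatch(pattern, token) is not None


# "+" requires at least one occurrence of the preceding character.
print(token_matches("wan+a", "wana"), token_matches("wan+a", "wanna"), token_matches("wan+a", "waa"))

# "*" also accepts zero occurrences of the preceding character.
print(token_matches("pan*a", "pana"), token_matches("pan*a", "panna"), token_matches("pan*a", "paa"))
```

The only difference between the two is the zero-occurrence case: "waa" is rejected by "wan+a", while "paa" is accepted by "pan*a".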
Line 141: Line 141:
 “.+” means that in this part of the query any character or sequence of characters (at least one) may appear.

-[[http://tinyurl.com/ocf7pcv|t.+m]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/t.%2Bm/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|t.+m]]

 {{:tam.jpg|}}
Line 147: Line 147:
 ".*" is used when either nothing or any combination of characters may appear after the sequence typed in by the user:

-[[http://tinyurl.com/olxcb9h|tren.*]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/tren.*/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|tren.*]]

 {{:tren.jpg|}}
Line 153: Line 153:
 Using more than one dot allows the user to replace a set number of characters:

-[[http://tinyurl.com/pqszb7b|b....s]]
+[[http://pelcra.clarin-pl.eu/NKJP/#search/pl/b....s/true/-1/-1/0/20/-1/1/1000/0/-1/0/1000/.*/-3,-2,-1,1,2,3/3/-1/-1/-1/false/2/medium/0|b....s]]

 {{:biceps.jpg|}}
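The dot-based patterns from this section can be tried out in the same way. The table-driven Python sketch below pairs each pattern with one word that should match and one that should not, again assuming whole-word matching via `re.fullmatch`. The sample words were chosen for illustration and are not taken from corpus results.

```python
import re

# (pattern, word expected to match, word expected not to match)
CASES = [
    ("zaka.ę", "zakażę", "zakę"),   # one dot stands for exactly one character
    ("t.+m", "tam", "tm"),          # ".+" needs at least one character
    ("tren.*", "tren", "teren"),    # ".*" also accepts nothing after "tren"
    ("b....s", "biceps", "bus"),    # four dots stand for exactly four characters
]

results = []
for pattern, good, bad in CASES:
    good_ok = re.fullmatch(pattern, good) is not None
    bad_ok = re.fullmatch(pattern, bad) is not None
    results.append((pattern, good_ok, bad_ok))
    print(pattern, good, good_ok, bad, bad_ok)
```

Each dot consumes exactly one character, so the number of dots fixes the word length, whereas ".+" and ".*" leave the length open.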
slopeq_for_nkjp.1460977130.txt.gz · Last modified: 2016/04/18 12:58 by pezik