slopeq_for_bnc

Differences

This shows you the differences between two versions of the page.

slopeq_for_bnc [2015/07/08 17:56] – [Wild card and quantifiers] gaszewski
slopeq_for_bnc [2017/02/03 01:15] (current) – [Wild card and quantifiers] mmolenda
Line 5: Line 5:
 ===== Surface queries =====
  
-This is the simplest type of queries. Words are written in the query box in their plain orthographic form. The results are occurrences of the particular forms submitted in the query. Compare the query and example results:
+This is the simplest type of queries. Words are written in the query box in their plain orthographic form. The results are occurrences of the particular forms submitted in the query. Compare the query and the results:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/acknowledge/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|acknowledge]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/acknowledge/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|acknowledge]]
 ;#;
  
Line 21: Line 21:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/in%20vain/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|in vain]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/in%20vain/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|in vain]]
 ;#;
  
Line 40: Line 40:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Ddecide%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|<lemma=decide>]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Ddecide%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<lemma=decide>]]
 ;#;
  
Line 55: Line 55:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dtake%3E%20advantage/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|<lemma=take> advantage]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dtake%3E%20advantage/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<lemma=take> advantage]]
 ;#;
  
Line 74: Line 74:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/issue%7Cmatter/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|issue|matter]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/issue%7Cmatter/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|issue|matter]]
 ;#;
  
Line 87: Line 87:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dconsist%3E%7C%3Clemma%3Dcomprise%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|<lemma=consist>|<lemma=comprise>]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dconsist%3E%7C%3Clemma%3Dcomprise%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<lemma=consist>|<lemma=comprise>]]
 ;#;
  
Line 104: Line 104:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/benefit%7Cprofit%7Cgain%20from/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|benefit|profit|gain from]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/benefit%7Cprofit%7Cgain%20from/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|benefit|profit|gain from]]
 ;#;
  
Line 119: Line 119:
 ==== Slop factor ====
  
-This important functionality allows to search for a discontinuous string of words. The query specifies how many words may intervene between the terms of the query. The searched words are taken into round brackets and the allowed number of intervening words is given after the equation sign e.g.:
+This important functionality allows to search for a discontinuous string of words. The query specifies how many words may intervene between the terms of the query. The allowed number of intervening words is set with the slider located below the search box. In this Wiki the value of the Slop factor is indicated by the following expression: (Slop factor = 1,2,3... etc.).
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/(adopt%20policy)%3D2/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|(adopt policy)=2]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/adopt%20policy/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/2/-1/-1/-1/-1/-1/-1|adopt policy (Slop factor = 2)]]
 ;#;
  
Line 137: Line 137:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/(wait%20in%7Con)%3D1/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|(wait in|on)=1]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/wait%20in%7Con/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/1/-1/-1/-1/-1/-1/-1|wait in|on (Slop factor = 1)]]
 ;#;
  
Line 151: Line 151:
 ==== Slop factor with relaxed order ====
  
-These queries allow intervening words up to the specified number and the query terms may appear in any order. In the format of the query the tilde “~” is used instead of the sign of equation.
+These queries allow intervening words up to the specified number and the query terms may appear in any order. This option is activated by unchecking the "order" box located next to the Slop factor slider ("order" unchecked).
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/(wrong%20approach)~2/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|(wrong approach)~2]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/wrong%20approach/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/false/2/-1/-1/-1/-1/-1/-1|wrong approach (Slop factor = 2) ("order" unchecked)]]
 ;#;
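
The behaviour of the Slop factor and the "order" box can be illustrated with a small sketch. It is not SlopeQ's implementation: the helper name `find_with_slop`, the whitespace tokenisation and the restriction to two-term queries are assumptions made for illustration only.

```python
def find_with_slop(tokens, first, second, slop=0, ordered=True):
    """Return (i, j) index pairs where the two terms co-occur with at most
    `slop` intervening words. Hypothetical helper, not part of SlopeQ."""
    pairs = [(first, second)]
    if not ordered:
        # "order" unchecked: also accept the terms in reverse order.
        pairs.append((second, first))
    hits = []
    for left, right in pairs:
        for i, tok in enumerate(tokens):
            if tok != left:
                continue
            # The partner term may sit 1 .. slop + 1 positions to the right.
            window = tokens[i + 1 : i + slop + 2]
            if right in window:
                hits.append((i, i + 1 + window.index(right)))
    return sorted(hits)


sentence = "What was wrong with that approach ?".split()
print(find_with_slop(sentence, "wrong", "approach", slop=2))  # two words intervene
print(find_with_slop(sentence, "wrong", "approach", slop=1))  # too far apart

reversed_order = "the approach was wrong".split()
print(find_with_slop(reversed_order, "wrong", "approach", slop=2, ordered=False))
```

With the order enforced, the last sentence would yield no hit, because //approach// precedes //wrong//.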
  
Line 164: Line 163:
 |5 |  What was|  wrong with that approach  | ? |
  
-The relaxed order in pure form is available with the number 0, i.e. with no intervening words:
+The relaxed order in pure form is available when the Slop factor is set to 0:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/(he%20would)~0/-1/0/1000/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|(he would)~0]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/he%20would/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/false/0/-1/-1/-1/-1/-1/-1|he would ("order" unchecked)]]
 ;#;
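
The example links in the current revision all follow one pattern: the percent-encoded query, a fixed run of fields, then what appears to be the "order" flag (`true`/`false`) and the Slop factor. The sketch below rebuilds such a link. The function name `slopeq_url` is hypothetical, the meaning of the two fields is inferred from the example links rather than documented, and the remaining fields are copied verbatim without interpretation.

```python
from urllib.parse import quote

BASE = "http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/"
# Copied verbatim from the example links; the meaning of these fields is not documented here.
PREFIX = "-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4"
SUFFIX = "-1/-1/-1/-1/-1/-1"


def slopeq_url(query, slop=0, ordered=True):
    """Build a search link shaped like the ones on this page (assumed layout)."""
    # Leave ! ~ * ' ( ) unescaped, as the example links do.
    encoded = quote(query, safe="!~*'()")
    order_flag = "true" if ordered else "false"
    return f"{BASE}{encoded}/{PREFIX}/{order_flag}/{slop}/{SUFFIX}"


print(slopeq_url("wrong approach", slop=2, ordered=False))
print(slopeq_url("<lemma=decide>"))
```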
  
Line 180: Line 179:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/(%3Clemma%3Dabuse%3E%20%3Clemma%3Dright%3E)~1/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|(<lemma=abuse> <lemma=right>)~1]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dabuse%3E%20%3Clemma%3Dright%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/false/1/-1/-1/-1/-1/-1/-1|<lemma=abuse> <lemma=right> (Slop factor = 1) ("order" unchecked)]]
 ;#;
  
Line 197: Line 196:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dremain%3E%7C!remaining/-1/0/100/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|<lemma=remain>|!remaining]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dremain%3E%7C!remaining/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/false/1/-1/-1/-1/-1/-1/-1|<lemma=remain>|!remaining]]
 ;#;
  
Line 230: Line 229:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/develop.*/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|develop.*]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/develop.*/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|develop.*]]
 ;#;
  
Line 244: Line 243:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/develop.%2B/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|develop.+]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/develop.%2B/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|develop.+]]
 ;#;
  
Line 257: Line 256:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/d....e/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|d....e]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/d....e/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|d....e]]
 ;#;
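
The three wild-card queries above (`develop.*`, `develop.+`, `d....e`) differ only in how many characters the dot and its quantifier must consume. The sketch below uses Python's `re` module as a stand-in and assumes that a query has to match a whole word, which is an assumption about SlopeQ rather than documented behaviour; the word list is invented.

```python
import re

words = ["develop", "developed", "developing", "development",
         "decide", "define", "done", "decline"]

# ".*" allows zero or more extra characters, so the bare stem also matches.
star = [w for w in words if re.fullmatch(r"develop.*", w)]
# ".+" demands at least one extra character, so the bare stem is excluded.
plus = [w for w in words if re.fullmatch(r"develop.+", w)]
# Four dots stand for exactly four characters between "d" and "e".
six_letters = [w for w in words if re.fullmatch(r"d....e", w)]

print(star)
print(plus)
print(six_letters)
```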
  
Line 273: Line 272:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dpain.%2B%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|<lemma=pain.+>]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dpain.%2B%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<lemma=pain.+>]]
 ;#;
  
Line 289: Line 288:
  
 ;#;
-[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/vari.%2B%7C!variety/-1/0/500/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/-1/-1/-1/-1/-1/-1/-1|vari.+|!variety]]
+[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/vari.%2B%7C!variety/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|vari.+|!variety]]
 ;#;
  
Line 307: Line 306:
 SlopeQ for the BNC makes use of the BNC tagset. We will not present the tagset in full here, but only discuss its application in our search engine. A list of all the tags is available [[http://www.natcorp.ox.ac.uk/docs/c5spec.html|online]].
  
-The tags are three-symbol codes, which classify the exact grammatical form of the given form in the corpus. For example, “NN1” marks a singular common noun, “AJC” marks a comparative adjective and “VVG” marks an –ing form of a lexical verb. You can use exact tags like these, but it is even better to make underspecified queries by means of the wild card (and quantifiers). This is possible because the tags form a neatly ordered system. Thus, all tags starting with “N” mark nouns, all tags starting with “V” mark verbs, in particular lexical verbs are marked by tags with “VV” at the beginning, and all tags starting with “AJ” mark adjectives etc.
+The tags are three-symbol codes, which classify the exact grammatical form of the given word in the corpus. For example, “NN1” marks a singular common noun, “AJC” marks a comparative adjective and “VVG” marks an –ing form of a lexical verb. You can use exact tags like these, but it is even better to make underspecified queries by means of the wild card (and quantifiers). This is possible because the tags form a neatly ordered system. Thus, all tags starting with “N” mark nouns, all tags starting with “V” mark verbs, in particular lexical verbs are marked by tags with “VV” at the beginning, and all tags starting with “AJ” mark adjectives etc.
  
 All grammatical queries follow the same formula. They are written in angle brackets as an equation “pos=” (pos stands for part-of-speech). The tag or regex tag is put immediately after the equals sign. Our first example uses an exact tag, the one for the third person singular present tense of lexical verbs.
 +
 +;#;
 +[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Cpos%3DVVZ%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<pos=VVZ>]]
 +;#;
 +
 +^  # ^  Left  ^ Match ^  Right  ^
 +|1 |  I 'm sure if ever the occasion|  arises  | when I want advice on insurance , you 're the first person I 'll come to . ’|
 +|2 |  they can move on to full doctor status and for many students the chance to experience life in another country more than|  makes  | up for the extra years of study .|
 +|3 |  It also|  requires  | only a fraction of the fees of other European Universities .|
 +|4 |  The trust has now drawn up detailed plans and|  claims  | a living museum at Bletchley Park could attract at least 100,000 visitors every year .|
 +|5 |  I do n't suppose anyone really|  wants  | to see him , do they ?|
 +
 +A formula covering a series of tags can be obtained by using the wild card. For example, to search for nouns in general you can enter <pos=N.+>, <pos=N.*> or <pos=N..>. These are not identical as regex queries, but because every tag has exactly three symbols, they return the same results. This use is illustrated in further examples.
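
The equivalence of the three noun patterns on three-symbol tags can be checked with a short sketch. It uses Python's `re` module and a hand-picked sample of tags from the BNC tagset; whole-tag matching is assumed.

```python
import re

# A small sample of three-symbol BNC tags (not the full tagset).
sample_tags = ["NN0", "NN1", "NN2", "NP0", "VVZ", "VVG", "AJC", "AJ0", "PRP"]

patterns = ["N.+", "N.*", "N.."]
results = {
    pattern: [tag for tag in sample_tags if re.fullmatch(pattern, tag)]
    for pattern in patterns
}

for pattern, tags in results.items():
    print(pattern, tags)
```

On tags of other lengths the patterns would diverge: `N.*` would accept a bare `N`, while `N..` would reject anything longer than three symbols.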
 +
 +It is possible to combine grammatical search with other functionalities. The following query yields sequences of the word //beautiful// followed by any noun.
 +
 +;#;
 +[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/beautiful%20%3Cpos%3DN.%2B%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|beautiful <pos=N.+>]]
 +;#;
 +
 +^  # ^  Left  ^ Match ^  Right  ^
 +|1 |  A paint-effect wall makes a|  beautiful backdrop  | , whether you try your hand at sponging or go in for a more adventurous colour-wash finish . |
 +|2 |  HOMES on a north Belfast street have rooms with a|  beautiful view  | today . |
 +|3 |  Miss Harker removed her bonnet , a|  beautiful item  | with long blue ribbons , and looked round for somewhere to hang it .|
 +|4 |  the woman could only hope that moving her right away from the influence of the people she went around with into these|  beautiful surroundings  | might bring her back to herself .|
 +|5 |  And you gave me everything , my|  beautiful Maggie  | . |
 +
 +This kind of query is well suited to finding collocates of a given word that belong to a specific grammatical class.
 +
 +The next query combines the slop factor with a base form. The results are sequences of any form of the word //derive// followed by a preposition, with at most one intervening word.
 +
 +;#;
 +[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dderive%3E%20%3Cpos%3DPRP%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/1/-1/-1/-1/-1/-1/-1|<lemma=derive> <pos=PRP> (Slop factor = 1)]]
 +;#;
 +
 +^  # ^  Left  ^ Match ^  Right  ^
 +|1 |  More frequently these|  derive from  | species still found in tropical and semi-tropical habitats |
 +|2 |  The payments pursuant to the discretionary power|  derive from  | an overseas source and come within Drummond v Collins 6 TC 525 .|
 +|3 |  As a result , it is impossible to|  derive egalitarianism in  | the Marxist sense from a Biblical foundation .|
 +|4 |  the development of local government reflected economic organization and the political processes which|  derived from  | it . |
 +|5 |  A cosmopolitan group|  derived mainly from  | crosses between bush roses and wild species , many noted for their vigour , scent and exuberant flowering .|
 +|6 |  The model is|  derived by  | the processes of data analysis|
 +
 +It is also possible to apply a base form query and a grammatical query to a single term. The labels “lemma=” and “pos=” must then be placed in the same pair of brackets. The following query yields occurrences of all forms of the verb //approach//, but not of the noun.
 +
 +;#;
 +[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/%3Clemma%3Dapproach%20pos%3DV..%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|<lemma=approach pos=V..>]]
 +;#;
 +
 +^  # ^  Left  ^ Match ^  Right  ^
 +|1 |  As they|  approach  | a grazing herd of , say , wildebeest , they spread out in line abreast and begin the stalk .|
 +|2 |  she added : ‘ A customer is a customer , I|  approach  | them in all the same way .|
 +|3 |  When The Art Newspaper|  approached  | the British Museum they pointed out that they do not possess the Altar of Cybele ,|
 +|4 |  When CDC 's intention became clear , it was|  approached  | by a variety of potential partners , Ousley says , and ‘ came very close to changing ’ its decision|
 +|5 |  We think now , as Christmas|  approaches  | , and that elusive fishkeeping present becomes top priority , is the perfect time to get the unbiased recommendations of our experts as to what they would buy , given the chance .|
 +
 +In general, it is possible to add the grammatical specification to any other kind of query term by writing it immediately after the term (without a space). Below is a regex query for words ending in //-fish// that must also be tagged as nouns. In this way we exclude adjectives like //selfish//.
 +
 +;#;
 +[[http://pelcra.clarin-pl.eu/SlopeqBNC/#search/pl/.%2Bfish%3Cpos%3DN.*%3E/-1/0/20/-1/-1/-1/-1/-1/1000/NN.*/-1,1/4/true/0/-1/-1/-1/-1/-1/-1|.+fish<pos=N.*>]]
 +;#;
 +
 +^  # ^  Left  ^ Match ^  Right  ^
 +|1 |  Experienced , mature|  Angelfish  | can fill two or three Amazon Sword leaves with eggs|
 +|2 |  Over-exploitation has led to a collapse in numbers of|  bluefish  | tuna , swordfish and cod in the Atlantic .|
 +|3 |  A popular aquarium fish , the range of the Redfin|  Butterflyfish  | stretches from the tropical Indo-Pacific to South Africa .|
 +|4 |  We have a 10 gallon tank with an undergravel filter containing two common goldfish , one fancy|  goldfish  | and a Moor .|
 +|5 |  Ian Lucas spots some new opportunities with mid-price|  catfish  | .|
 +|6 |  Echinoderms , like|  starfish  | , urchins and sea-cucumbers can be found here .|
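
The effect of combining a word or base form constraint with a tag constraint on the same term can be sketched as a filter over annotated tokens. The `Token` type, the `matches` helper and the hand-tagged sample are illustrative assumptions, not SlopeQ internals.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    lemma: str
    pos: str


def matches(token, word=None, lemma=None, pos=None):
    """True if every constraint that was given matches its whole field."""
    checks = ((word, token.word), (lemma, token.lemma), (pos, token.pos))
    return all(re.fullmatch(pat, val) for pat, val in checks if pat is not None)


# Hand-tagged sample, for illustration only.
tokens = [
    Token("goldfish", "goldfish", "NN1"),
    Token("selfish", "selfish", "AJ0"),
    Token("starfish", "starfish", "NN1"),
    Token("approached", "approach", "VVD"),
    Token("approach", "approach", "NN1"),
]

# Analogue of .+fish<pos=N.*>: the adjective "selfish" is filtered out.
fish_nouns = [t.word for t in tokens if matches(t, word=r".+fish", pos=r"N.*")]
# Analogue of <lemma=approach pos=V..>: the noun reading is filtered out.
verb_approach = [t.word for t in tokens if matches(t, lemma=r"approach", pos=r"V..")]

print(fish_nouns)
print(verb_approach)
```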
 +
  
slopeq_for_bnc.txt · Last modified: 2017/02/03 01:15 by mmolenda