Re: Re: Re: Representation language trade-off results
I share Stephan's first comment: besides the four presented possibilities - DFDL, SDF, BinX and EAST - the idea for this trade-off was to have a fifth option, a new language designed from scratch, even if the comparison would not be very fair because it would rest on hypothetical (not actual) characteristics of that new language. I propose that it be included in the trade-off, even if it is dismissed briefly upfront.
I also share the view that this issue is different from the one in trade-off 0066. The two issues are related but still independent enough to be looked at separately. Trade-off 0066 resulted from a RID/idea from Dominic Lowe and was "only" about whether instead of having new schemas for each new product type, we could have a very generic schema able to represent all possible product types (quite challenging and, I agree, of dubious added value because it would have to be really generic) and then instances of this generic XML schema for each new product type.
Anyway, the real issue is then whether the decision of having 1 schema + M instances instead of N schemas influences trade-off 0070 and in particular the need for a new language to be designed. I don't think so.
We should also not forget the comment that Bernard Buckl from DLR made during the SRR, that there are probably strong (unavoidable) reasons for languages like DFDL and SDF to use annotations to be able to represent binary data. I think this is the really interesting discussion, but it is more relevant to trade-off 0066.
Regarding this trade-off, I had a few comments of my own:
- I am slightly concerned about the immaturity of DFDL and in particular of the Daffodil parser. I'm sure we will see updates to both in the coming months and this is not optimal considering the upcoming SAFE developments.
- Maybe the trade-off could consider other parameters in the analysis, such as "maturity", "complexity" and "limitations". DFDL, for example, would probably score low on maturity, EAST would score low on complexity (i.e. it's complex), and BinX would score low on limitations (it has several).
- As Stephan also mentioned, I found it curious that BinX uses a master schema, with the representation information then consisting of XML instances/documents. This is exactly the subject of trade-off 0066.
As for the conclusion of the trade-off, I have no doubts about changing to DFDL as it stands.
Paulo