Jason Hull, 02/22/12
711 views
0 replies
The probability of clicking on a search engine result is a function of keyword density and the relative placement. Ideally, you should be looking at your top 100 search terms and getting a RUP score for each of them.
Rafał Kuć, 02/21/12
1619 views
0 replies
Learn how Rafał Kuć built a simple photo search feature that pulls JPEG metadata such as aperture, shutter speed, focal length, or ISO value. It was all built using the open source Apache projects Solr and Tika.
Matt Overstreet, 02/20/12
2609 views
0 replies
Building a pure JavaScript search UI can help clean up and organize your code. At some point we manipulate the DOM so much that we may as well have created it in JavaScript in the first place. Learn how to efficiently render search pages, protect your search results from spam, and forget about SEO on search pages.
Mitch Pronschinske, 02/19/12
2386 views
0 replies
One company that provides a free Solr performance monitoring solution is asking how best to report changes in an index's size. This poll asks what DZone audience members would want to see in an optimal Solr index monitoring solution.
Mitch Pronschinske, 02/18/12
1599 views
0 replies
This quick search reference is meant to help with some of the logical inconsistencies in the parameter names (e.g. the query type parameter is "qt" but the query parser parameter is "defType"). It is meant to help you remember these oddities that are hard to keep straight in your head.
Tony Russell-Rose, 02/17/12
1914 views
1 reply
Icons are a form of symbolic language. The purpose of this section is to review some of the issues involved in developing a grammar for icons, and to explore the possibilities of applying such a grammar.
Peter Karich, 02/16/12
1753 views
0 replies
Several times per month there are multiple questions regarding the ElasticSearch query structure posted on the user group. Although there are already good docs explaining this in depth, I think a bird's eye view of the Query DSL is necessary to understand what is written there. This brief tutorial should clarify things for you.
Giorgio Sironi, 02/16/12
1823 views
0 replies
Learn how to prototype a spam filter with one of the simplest machine learning approaches, the naive Bayes classifier.
Mitch Pronschinske, 02/15/12
1966 views
0 replies
Often, there are sound reasons why using DIH and/or Tika to index data in Solr is not optimal. For those situations SolrJ may be the most appropriate choice, so Erick Erickson has created a skeletal program to help you understand a real use case for SolrJ.
Geoffrey Papilion, 02/14/12
2020 views
0 replies
Our search index has grown in the last few months by 20% and our JVM and Solr setups were beginning to groan under the weight of the data. While doing your JVM tuning and Solr configuration you should also note that if you're performing a query against a large index and you want to use dismax, you should try the strategy outlined in this blog.
Tony Russell-Rose, 02/14/12
2300 views
0 replies
Learn about some of the key challenges in text analytics and some of Endeca’s current research in this area, then examine the current state of the text analytics market and explore some of the prospects for the future.
Yonik Seeley, 02/13/12
1355 views
0 replies
Advanced Filter Caching is a relatively new feature in Solr. Get a hands-on tutorial by none other than the creator of Solr, Yonik Seeley.
Mitch Pronschinske, 02/11/12
2816 views
0 replies
A very handy new feature called 'query time joining' is coming to Lucene sooner than anticipated. While it was confirmed last month that it would be in Lucene 4.0, the most recent news from Apache indicates that it will already be included in Lucene 3.6.
Lynda Moulton, 02/10/12
2087 views
0 replies
Learn why it's important, especially in an industry like healthcare, to have people who know how to search properly and to have usable search technology.
Rafał Kuć, 02/08/12
2033 views
0 replies
Most of the time, when working with Solr's Dismax query parser, we use parameters like qf, pf, or mm and forget about a very useful parameter that controls how the lower-scoring fields are treated: the tie parameter. In this article you'll learn how this parameter can be put to good use.
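As a rough illustration of what tie does, here is a minimal Python sketch of disjunction-max style scoring: the best-scoring field counts in full, and every other matching field contributes its score multiplied by tie. The function name and the per-field scores are made up for the example; this is not Solr code and says nothing about how the linked article presents the topic.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores in disjunction-max style (hypothetical helper).

    The highest field score counts in full; the remaining field scores
    are added after being scaled by `tie`.
    """
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others


# Made-up scores for one document matched in three fields.
scores = [2.0, 1.0, 0.5]

pure_max = dismax_score(scores, tie=0.0)   # only the best field counts
blended = dismax_score(scores, tie=0.5)    # lower fields count at half weight
pure_sum = dismax_score(scores, tie=1.0)   # every field counts in full
```

With tie at 0 the document is ranked purely by its best field; with tie at 1 the field scores are simply summed; values in between let lower-scoring fields act as tie-breakers.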
Tony Russell-Rose, 02/07/12
1676 views
0 replies
Search is more than just findability. So why the fixation with findability? Out of 104 enterprise search scenarios, fewer than 2% were categorised as findability tasks. In this post you will learn about the broader, overall information goals for most search efforts.
Rafał Kuć, 02/06/12
2364 views
0 replies
One of the features of the latest Solr version (3.5) is the ability to identify the language of a document during indexing. In today's entry we will see how Apache Solr works together with Apache Tika to identify the language of documents.
Tony Russell-Rose, 02/05/12
2084 views
0 replies
Read several articles by search experts like Charlie Hull and Tyler Tate. A great resource for enterprise search followers.
Mitch Pronschinske, 02/03/12
6509 views
1 reply
Learn about the controversy that emerged in the early days of the Java 7 GA release because of the effect it had on Apache Lucene, from the perspective of Uwe Schindler, who blogged about the bug. The recent release of Java 7 and its testing with Lucene...
Rafał Kuć, 02/02/12
1532 views
0 replies
One of the configuration variables we can find in the solrconfig.xml file is maxBooleanClauses, which specifies the maximum number of boolean clauses that can be combined in a single query. The question is, do I have to worry about it when using filters in Solr? Find out.
Giorgio Sironi, 02/02/12
3662 views
0 replies
R is a language for statistical computing. In a world of big data and scientific approaches to startup ideas, you can have the advantage of a tool in your box for statistical analysis and mathematical computations that is more powerful than a general purpose...
Eric Genesky, 02/01/12
2101 views
0 replies
Today, Lucid Imagination releases LucidWorks Cloud, a SaaS version of their LucidWorks Enterprise platform, for general availability.
Tony Russell-Rose, 02/01/12
2549 views
0 replies
I’ve been thinking recently about the role of ‘advanced search’, i.e. the practice whereby some sites withhold certain aspects of functionality from ‘standard’ search and accommodate them instead within a separate search experience. Now,...
Mitch Pronschinske, 02/01/12
2274 views
0 replies
Mark your calendars today! The largest worldwide conference dedicated to Lucene and Solr will take place in Boston May 7-10. The 2012 conference will build on the success of last year’s Lucene Revolution...
Jason Hull, 01/30/12
2841 views
0 replies
Recently, we had a project where we helped a client index a corpus of Chinese language documents in Solr. We have asked Dan Funk, a committer to Project Blacklight, to provide a guest blog post for us on the details of how to approach indexing Chinese,...