Use ElasticSearch to search and use external sources, like Wikipedia, inside eXo Platform

Since version 4, eXo Platform has included a unified search that greatly improves its search capabilities. All of the platform’s resources (content, files, wiki pages, etc.) can now be found from a single, centralized location.

[Screenshot: default unified search]

Besides these out-of-the-box capabilities, a new API allows creation of custom search connectors in order to extend the search scope and enrich the results. This blog post explains how to implement and configure such a connector.

For this blog post, the search connector will retrieve data indexed by ElasticSearch, a powerful and easy-to-use search engine. It is of course up to you to decide what your search connector returns (data indexed by another search engine, data from a database, other custom data stored in eXo, etc.).

ElasticSearch

The first step is to install ElasticSearch. All you need to do is download it, extract it, and start it with the command below (on the 0.90.x releases used here, the -f flag keeps the process in the foreground):

bin/elasticsearch -f

Index data in ElasticSearch

Out of the box, ElasticSearch is empty: no data has been indexed yet, so we need to feed it. For this purpose, we will use the Wikipedia River plugin. A river is an ElasticSearch component that feeds ElasticSearch with data to index. The Wikipedia River simply feeds it with Wikipedia pages.

After stopping your ElasticSearch server you can install the plugin with:

bin/plugin -install elasticsearch/elasticsearch-river-wikipedia/1.3.0

After restarting ElasticSearch you should see logs similar to the following:

[2013-11-27 11:55:48,716][INFO ][node] [It, the Living Colossus] version[0.90.7], pid[14776], build[36897d0/2013-11-13T12:06:54Z]
[2013-11-27 11:55:48,716][INFO ][node] [It, the Living Colossus] initializing ...
[2013-11-27 11:55:48,725][INFO ][plugins] [It, the Living Colossus] loaded [river-wikipedia], sites []
[2013-11-27 11:55:50,632][INFO ][node] [It, the Living Colossus] initialized
[2013-11-27 11:55:50,632][INFO ][node] [It, the Living Colossus] starting ...
[2013-11-27 11:55:50,718][INFO ][transport] [It, the Living Colossus] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.0.5:9300]}

This confirms that the Wikipedia River plugin is correctly installed (note the loaded [river-wikipedia] entry).

We can now start indexing Wikipedia pages in ElasticSearch by creating the river with a REST call (we are using curl here; feel free to use your favorite tool):

curl -XPUT localhost:9200/_river/my_river/_meta -d '
{
  "type" : "wikipedia"
}
'

A lot of data is now being indexed by ElasticSearch (yes, Wikipedia is a huge source of data :)). You can check this by executing a search with:

curl -XGET 'http://localhost:9200/_search?q=test'
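
The same check can be run from Java, which is also the language the connector itself will use to talk to ElasticSearch. The following is a minimal sketch using only the JDK; SearchCheck, searchUrl, and fetch are illustrative names, and the host and port are assumed to be the local defaults used in the curl call above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

class SearchCheck {

  // Builds the URI-search URL, URL-encoding the query text.
  static String searchUrl(String baseUrl, String queryText) {
    try {
      return baseUrl + "/_search?q=" + URLEncoder.encode(queryText, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is required on every JVM, so this should never happen
      throw new IllegalStateException(e);
    }
  }

  // Sends the GET request and returns the raw JSON response body.
  static String fetch(String url) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
    connection.setRequestMethod("GET");
    InputStream in = connection.getInputStream();
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      return out.toString("UTF-8");
    } finally {
      in.close();
      connection.disconnect();
    }
  }
}
```

With a node running locally, SearchCheck.fetch(SearchCheck.searchUrl("http://localhost:9200", "test")) should return the same JSON as the curl command.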

Warning: the Wikipedia River will index a lot of data. You should stop the river after a few minutes to avoid filling your entire disk space ;-). This can be done by deleting the river with:

curl -XDELETE localhost:9200/_river/my_river

Now that we have data indexed by ElasticSearch, let’s dig into the eXo search connector!

eXo search connector

A search connector is a simple class that extends org.exoplatform.commons.api.search.SearchServiceConnector and implements the “search” method:

package org.exoplatform.search.elasticsearch;

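import org.exoplatform.commons.api.search.SearchServiceConnector;
import org.exoplatform.commons.api.search.data.SearchContext;
import org.exoplatform.commons.api.search.data.SearchResult;
import org.exoplatform.container.xml.InitParams;

import java.util.ArrayList;
import java.util.Collection;

public class ElasticSearchConnector extends SearchServiceConnector {

  public ElasticSearchConnector(InitParams initParams) {
    super(initParams);
  }

  @Override
  public Collection<SearchResult> search(SearchContext context, String query, Collection<String> sites, int offset, int limit, String sort, String order) {
    // Fetch data and convert each hit to a SearchResult; empty for now
    return new ArrayList<SearchResult>();
  }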
}

It needs to be declared in the eXo configuration, either in an extension or directly in the jar which will contain the connector class. Let’s go for the jar method:

  • add the class to your jar
  • add a file named configuration.xml under conf/portal in your jar with the following content (the “type” tag contains the fully qualified name of your connector class):
<configuration
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.exoplatform.org/xml/ns/kernel_1_2.xsd http://www.exoplatform.org/xml/ns/kernel_1_2.xsd"
   xmlns="http://www.exoplatform.org/xml/ns/kernel_1_2.xsd">

    <external-component-plugins>
      <target-component>org.exoplatform.commons.api.search.SearchService</target-component>
      <component-plugin>
        <name>ElasticSearchConnector</name>
        <set-method>addConnector</set-method>
        <type>org.exoplatform.search.elasticsearch.ElasticSearchConnector</type>
        <description>ElasticSearch Connector</description>
        <init-params>
          <properties-param>
             <name>constructor.params</name>
             <property name="searchType" value="wikipedia"/>
             <property name="displayName" value="Wikipedia"/>
          </properties-param>
        </init-params>      
      </component-plugin>
    </external-component-plugins>

</configuration>

We now have the skeleton of our search connector. The last step is to implement the search method.

Fetching results from ElasticSearch

We need to call ElasticSearch in order to fetch Wikipedia pages based on the input parameters of the search (query text, offset, limit, sort field, sort order). ElasticSearch provides a Java Client API (TransportClient). Sadly, it depends on Lucene artifacts, and eXo Platform already embeds Lucene artifacts whose versions do not necessarily match the ones ElasticSearch needs, which can cause conflicts. Instead, we will call the REST API directly:

package org.exoplatform.search.elasticsearch;

import org.apache.commons.io.IOUtils;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.exoplatform.commons.api.search.SearchServiceConnector;
import org.exoplatform.commons.api.search.data.SearchContext;
import org.exoplatform.commons.api.search.data.SearchResult;
import org.exoplatform.container.xml.InitParams;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;

import java.io.StringWriter;
import java.util.*;

public class ElasticSearchConnector extends SearchServiceConnector {

  private Map<String, String> sortMapping = new HashMap<String, String>();

  public ElasticSearchConnector(InitParams initParams) {
    super(initParams);

    sortMapping.put("date", "title"); // no date field on wikipedia results
    sortMapping.put("relevancy", "_score");
    sortMapping.put("title", "title");
  }

  @Override
  public Collection<SearchResult> search(SearchContext context, String query, Collection<String> sites, int offset, int limit, String sort, String order) {
    Collection<SearchResult> results = new ArrayList<SearchResult>();

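    // Fall back to relevance when the requested sort key is not mapped
    String sortField = sortMapping.containsKey(sort) ? sortMapping.get(sort) : "_score";

    String esQuery = "{\n" +
            "     \"from\" : " + offset + ", \"size\" : " + limit + ",\n" +
            "     \"sort\" : [\n" +
            "       { \"" + sortField + "\" : {\"order\" : \"" + order + "\"}}\n" +
            "     ],\n" +
            "     \"query\": {\n" +
            "        \"filtered\" : {\n" +
            "            \"query\" : {\n" +
            "                \"query_string\" : {\n" +
            // Escape the user input so quotes or backslashes cannot break the JSON body
            "                    \"query\" : \"" + JSONObject.escape(query) + "\"\n" +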
            "                }\n" +
            "            }\n" +
            "        }\n" +
            "     },\n" +
            "     \"highlight\" : {\n" +
            "       \"fields\" : {\n" +
            "         \"text\" : {\"fragment_size\" : 150, \"number_of_fragments\" : 3}\n" +
            "       }\n" +
            "     }\n" +
            "}";

    try {
      HttpClient client = new DefaultHttpClient();
      HttpPost request = new HttpPost("http://localhost:9200/_search");
      StringEntity input = new StringEntity(esQuery, "UTF-8"); // keep non-ASCII query text intact
      request.setEntity(input);

      HttpResponse response = client.execute(request);
      StringWriter writer = new StringWriter();
      IOUtils.copy(response.getEntity().getContent(), writer, "UTF-8");
      String jsonResponse = writer.toString();

      JSONParser parser = new JSONParser();

      Map json = (Map)parser.parse(jsonResponse);
      JSONObject jsonResult = (JSONObject) json.get("hits");
      JSONArray jsonHits = (JSONArray) jsonResult.get("hits");
      for(Object jsonHit : jsonHits) {
        JSONObject hitSource = (JSONObject) ((JSONObject) jsonHit).get("_source");
        String title = (String) hitSource.get("title");
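        JSONObject hitHighlights = (JSONObject) ((JSONObject) jsonHit).get("highlight");
        // A hit may carry no highlighted "text" fragments (for example a match on the title only)
        JSONArray hitHighlightsTexts = hitHighlights == null ? null : (JSONArray) hitHighlights.get("text");
        StringBuilder excerpt = new StringBuilder();
        if (hitHighlightsTexts != null) {
          for (Object hitHighlightsText : hitHighlightsTexts) {
            excerpt.append((String) hitHighlightsText).append(" ... ");
          }
        }
        String text = excerpt.toString();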

        results.add(new SearchResult(
                "http://wikipedia.org",
                title,
                text,
                "",
                "http://upload.wikimedia.org/wikipedia/commons/thumb/7/77/Wikipedia_svg_logo.svg/45px-Wikipedia_svg_logo.svg.png",
                new Date().getTime(),
                1
        ));
      }

    } catch (Exception e) {
      e.printStackTrace();
    }

    return results;
  }
}
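
Because the request body is assembled by string concatenation, every value placed in it has to be a valid JSON string. The following is a minimal sketch of a helper that builds the same body with a hand-written escaper and no third-party library. EsQuerySketch, escapeJson, and buildQuery are hypothetical names used for illustration only; they are not part of eXo or ElasticSearch.

```java
class EsQuerySketch {

  // Escapes the characters that would otherwise end or corrupt a JSON string literal.
  static String escapeJson(String value) {
    StringBuilder escaped = new StringBuilder();
    for (char c : value.toCharArray()) {
      switch (c) {
        case '"':  escaped.append("\\\""); break;
        case '\\': escaped.append("\\\\"); break;
        case '\n': escaped.append("\\n"); break;
        case '\r': escaped.append("\\r"); break;
        case '\t': escaped.append("\\t"); break;
        default:
          if (c < 0x20) {
            // remaining control characters become a four-digit hexadecimal escape
            escaped.append(String.format("\\u%04x", (int) c));
          } else {
            escaped.append(c);
          }
      }
    }
    return escaped.toString();
  }

  // Builds the search body; sortField is assumed to be already mapped to an index field.
  static String buildQuery(String queryText, int offset, int limit, String sortField, String order) {
    return "{"
        + "\"from\":" + offset + ",\"size\":" + limit + ","
        + "\"sort\":[{\"" + escapeJson(sortField) + "\":{\"order\":\"" + escapeJson(order) + "\"}}],"
        + "\"query\":{\"filtered\":{\"query\":{\"query_string\":{\"query\":\""
        + escapeJson(queryText) + "\"}}}},"
        + "\"highlight\":{\"fields\":{\"text\":{\"fragment_size\":150,\"number_of_fragments\":3}}}"
        + "}";
  }
}
```

Keeping the body construction in one small method also makes it easy to test without a running ElasticSearch node.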

Requests and responses are plain JSON. You can find more details about the ElasticSearch query syntax in its documentation. The important point for the search connector is that each result has to be a SearchResult object, and that the results are returned in a collection.
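
To make the shape of the response concrete: each entry of hits.hits carries the indexed page under _source and, when highlighting produced something, the fragments under highlight.text. The sketch below pulls the title and the excerpt out of one such hit. It works on plain java.util.Map and List so that it runs on its own; json-simple's JSONObject and JSONArray extend HashMap and ArrayList, so parsed hits can be passed in directly. HitMappingSketch and its methods are illustrative names, not eXo API.

```java
import java.util.List;
import java.util.Map;

class HitMappingSketch {

  // Reads the page title from the "_source" object of one hit, or "" if it is missing.
  static String title(Map<?, ?> hit) {
    Object source = hit.get("_source");
    if (!(source instanceof Map)) {
      return "";
    }
    Object title = ((Map<?, ?>) source).get("title");
    return title == null ? "" : title.toString();
  }

  // Joins the highlighted "text" fragments of one hit; returns "" when there are none.
  static String excerpt(Map<?, ?> hit) {
    Object highlight = hit.get("highlight");
    if (!(highlight instanceof Map)) {
      return "";
    }
    Object fragments = ((Map<?, ?>) highlight).get("text");
    if (!(fragments instanceof List)) {
      return "";
    }
    StringBuilder excerpt = new StringBuilder();
    for (Object fragment : (List<?>) fragments) {
      excerpt.append(fragment).append(" ... ");
    }
    return excerpt.toString();
  }
}
```

The two strings returned here correspond to the title and excerpt arguments passed to the SearchResult constructor in the listing.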

Deploy it in eXo, and enjoy!

We can now deploy our jar (which contains the connector class and the XML configuration file) in the library folder of the application server (/lib in Tomcat, for example) and start eXo.

A search using the quick search in the toolbar now retrieves content from Wikipedia:

[Screenshot: search results preview]

When the unified search screen is displayed, we can see that a new Wikipedia filter is listed, and our search results contain some Wikipedia pages:

[Screenshot: search results 1]

If you don’t want to see Wikipedia content in your results, simply uncheck the filter:

[Screenshot: search results 2]

The source code is available here, as a Maven project.

Learn more about this project and what you can do with eXo Platform; join the eXo tribe!

Brahim Jaouane

I am a Digital Marketing specialist specialized in SEO at eXo Platform. Passionate about new technologies and Digital Marketing. With 10 years' experience, I support companies in their digital communication strategies and implement the tools necessary for their success. My approach combines the use of different traffic acquisition levers and an optimization of the user experience to convert visitors into customers. After various digital experiences in communication agencies as well as in B2B company, I have a wide range of skills and I am able to manage the digital marketing strategy of small and medium-sized companies.
