com.google.enterprise.adaptor
Interface Adaptor

All Known Implementing Classes:
AbstractAdaptor, AdaptorTemplate, AdaptorWithCrawlTimeMetadataTemplate, CommandLineAdaptor, DbAdaptorTemplate, FileSystemAdaptor, FileSystemAdaptor, GroupDefinitionsFromCsv, GroupDefinitionsScaleTester, GroupDefinitionsWriter

public interface Adaptor

Interface for user-specific implementation details of an Adaptor. Implementations must be thread-safe. Implementations are encouraged to keep no state, or only soft state such as a connection cache.

Once configuration is prepared, init(com.google.enterprise.adaptor.AdaptorContext) will be called. This is guaranteed to occur before any calls to getDocContent(com.google.enterprise.adaptor.Request, com.google.enterprise.adaptor.Response) or getDocIds(com.google.enterprise.adaptor.DocIdPusher). When the adaptor needs to shut down, destroy() will be called.

If the adaptor is using AbstractAdaptor.main(com.google.enterprise.adaptor.Adaptor, java.lang.String[]), then initConfig(com.google.enterprise.adaptor.Config) will be called before init(com.google.enterprise.adaptor.AdaptorContext) to allow the adaptor an opportunity to set and override default configuration values.
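
The call order described above can be illustrated with a minimal sketch. MiniAdaptor, RecordingAdaptor and the run() driver are hypothetical stand-ins written for this example, not part of the library; the real methods take the parameters shown in the Method Summary, and the real driver is the library itself.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
  /** Simplified stand-in for Adaptor; the real methods take parameters. */
  interface MiniAdaptor {
    void initConfig();
    void init();
    void getDocIds();
    void destroy();
  }

  /** Records each callback so the call order can be inspected. */
  static class RecordingAdaptor implements MiniAdaptor {
    final List<String> calls = new ArrayList<>();
    private boolean initialized;

    @Override public void initConfig() {
      calls.add("initConfig");
    }

    @Override public void init() {
      initialized = true;
      calls.add("init");
    }

    @Override public void getDocIds() {
      if (!initialized) {
        throw new IllegalStateException("init() has not been called");
      }
      calls.add("getDocIds");
    }

    @Override public void destroy() {
      initialized = false;
      calls.add("destroy");
    }
  }

  /** Hypothetical driver that mimics the documented call order. */
  static List<String> run() {
    RecordingAdaptor adaptor = new RecordingAdaptor();
    adaptor.initConfig();  // only happens when launched via AbstractAdaptor.main
    adaptor.init();
    try {
      adaptor.getDocIds();
    } finally {
      adaptor.destroy();   // called when the adaptor needs to shut down
    }
    return adaptor.calls;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Because init is guaranteed to run first, an implementation can assume in getDocIds and getDocContent that whatever init prepared is already available.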

See Also:
AdaptorTemplate, AbstractAdaptor, PollingIncrementalLister

Method Summary
 void destroy()
          Shut down the adaptor and release its resources.
 void getDocContent(Request request, Response response)
          Provides the contents and metadata of a particular document.
 void getDocIds(DocIdPusher pusher)
          Pushes all the DocIds that are supposed to be indexed by the GSA.
 void init(AdaptorContext context)
          Initialize adaptor with the current context.
 void initConfig(Config config)
          Provides the opportunity for the Adaptor to create new configuration values or override default values.
 

Method Detail

getDocContent

void getDocContent(Request request,
                   Response response)
                   throws IOException,
                          InterruptedException
Provides the contents and metadata of a particular document. This method should be highly parallelizable and support twenty or more concurrent calls. On average, there may be two to three concurrent calls during initial GSA crawling, but twenty or more concurrent calls are typical when the GSA is recrawling unmodified content.

If you experience a fatal error, feel free to throw an IOException or RuntimeException. In the case of an error, the GSA will determine if and when to retry.

Throws:
IOException
InterruptedException
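
As an illustration of the concurrency expectation for getDocContent, the following sketch serves documents from a pool of twenty threads. MiniRequest and MiniResponse are simplified stand-ins invented for this example (the real Request and Response interfaces are richer), and the generated document body is purely hypothetical. The handler keeps no shared mutable state, so concurrent calls cannot interfere with each other.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentContentSketch {
  /** Simplified stand-ins for Request and Response. */
  interface MiniRequest { String getDocId(); }
  interface MiniResponse { OutputStream getOutputStream(); }

  /** Stateless handler: every call works only on its own arguments. */
  static void getDocContent(MiniRequest request, MiniResponse response)
      throws IOException {
    String body = "content of " + request.getDocId();
    response.getOutputStream().write(body.getBytes(StandardCharsets.UTF_8));
  }

  /** Serves docCount documents on a pool of 20 threads; returns the bodies in order. */
  static List<String> serveConcurrently(int docCount) {
    ExecutorService pool = Executors.newFixedThreadPool(20);
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (int i = 0; i < docCount; i++) {
        final String id = "doc" + i;
        tasks.add(() -> {
          // Each call gets its own output buffer, as each real call gets its own Response.
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          getDocContent(() -> id, () -> out);
          return new String(out.toByteArray(), StandardCharsets.UTF_8);
        });
      }
      List<String> bodies = new ArrayList<>();
      for (Future<String> result : pool.invokeAll(tasks)) {
        bodies.add(result.get());
      }
      return bodies;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(serveConcurrently(40).size() + " documents served");
  }
}
```

If an implementation does need shared resources such as a connection cache, a thread-safe structure is one way to keep the method safe under this kind of load.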

getDocIds

void getDocIds(DocIdPusher pusher)
               throws IOException,
                      InterruptedException
Pushes all the DocIds that are supposed to be indexed by the GSA. This will frequently involve re-sending DocIds to the GSA, but doing so allows healing previous errors and cache inconsistencies. Re-sending DocIds is very fast and should be considered free on the GSA. This method should determine a list of DocIds to push and call DocIdPusher.pushDocIds(java.lang.Iterable) one or more times, and DocIdPusher.pushNamedResources(java.util.Map) if using named resources.

pusher is provided as a convenience and is the same object provided to init(com.google.enterprise.adaptor.AdaptorContext) previously. This method may take a while, and implementations are free to call Thread.sleep(long) occasionally to reduce load.

If you experience a fatal error, feel free to throw an IOException or RuntimeException. In the case of an error, the ExceptionHandler in use in AdaptorContext will determine if and when to retry.

Throws:
IOException
InterruptedException
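
One way to call pushDocIds "one or more times" is to push the full set of identifiers in fixed-size batches. The sketch below assumes a simplified MiniPusher stand-in that accepts plain strings; the real DocIdPusher works with DocId objects and declares checked exceptions. The batch size and the sample identifiers are arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchedPushSketch {
  /** Simplified stand-in for DocIdPusher; real DocIds are not plain strings. */
  interface MiniPusher {
    void pushDocIds(Iterable<String> docIds);
  }

  /** Pushes every identifier, batchSize at a time. */
  static void getDocIds(List<String> allIds, int batchSize, MiniPusher pusher) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    for (int start = 0; start < allIds.size(); start += batchSize) {
      int end = Math.min(start + batchSize, allIds.size());
      // Copy the slice so the pusher never sees a view of the caller's list.
      pusher.pushDocIds(new ArrayList<>(allIds.subList(start, end)));
    }
  }

  /** Runs getDocIds against a recording pusher and returns the batches it received. */
  static List<List<String>> demo() {
    List<List<String>> received = new ArrayList<>();
    MiniPusher recorder = docIds -> {
      List<String> batch = new ArrayList<>();
      docIds.forEach(batch::add);
      received.add(batch);
    };
    getDocIds(Arrays.asList("a", "b", "c", "d", "e"), 2, recorder);
    return received;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Batching keeps memory use bounded when the repository is large, and a pause between batches is a natural place for the occasional Thread.sleep(long) mentioned above.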

initConfig

void initConfig(Config config)
Provides the opportunity for the Adaptor to create new configuration values or override default values. In most cases, only Config.addKey(java.lang.String, java.lang.String) should be called. The user's configuration will override any values set in this way. This method is called by AbstractAdaptor.main(com.google.enterprise.adaptor.Adaptor, java.lang.String[]) before init(com.google.enterprise.adaptor.AdaptorContext) is called.
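
The override behavior can be pictured with a small model. MiniConfig below is a simplified stand-in written for this example and is not the library's Config class; setUserValue is a hypothetical helper representing values loaded from the user's configuration, and the key names are invented.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigDefaultsSketch {
  /** Simplified stand-in for Config: defaults plus user-supplied overrides. */
  static class MiniConfig {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, String> userValues = new HashMap<>();

    /** Plays the role of Config.addKey: registers a key with a default value. */
    void addKey(String key, String defaultValue) {
      defaults.put(key, defaultValue);
    }

    /** Hypothetical helper standing in for the user's configuration. */
    void setUserValue(String key, String value) {
      userValues.put(key, value);
    }

    /** A user-supplied value wins over the adaptor's default. */
    String getValue(String key) {
      return userValues.containsKey(key) ? userValues.get(key) : defaults.get(key);
    }
  }

  /** What an adaptor's initConfig might look like: only addKey calls. */
  static void initConfig(MiniConfig config) {
    config.addKey("sample.rootDir", "/var/data");
    config.addKey("sample.batchSize", "500");
  }

  public static void main(String[] args) {
    MiniConfig config = new MiniConfig();
    config.setUserValue("sample.batchSize", "100");
    initConfig(config);
    System.out.println(config.getValue("sample.rootDir"));
    System.out.println(config.getValue("sample.batchSize"));
  }
}
```

In this model the adaptor only ever declares defaults, which is why the user's setting for sample.batchSize survives the call to initConfig.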


init

void init(AdaptorContext context)
          throws Exception
Initialize adaptor with the current context. This is the ideal time to start any threads to do extra behind-the-scenes work. The context points to other useful objects that can be used at any time. For example, methods on DocIdPusher provided via AdaptorContext.getDocIdPusher() are allowed to be called whenever the Adaptor wishes. This allows doing event-based incremental pushes at any time.

The method is called at the end of GsaCommunicationHandler.start().

If you experience a fatal error during initialization, feel free to throw an Exception to cancel the startup process.

Throws:
Exception
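
A minimal sketch of starting background work in init and stopping it in destroy follows. PollingAdaptor, pollForChanges and the 50 ms delay are hypothetical; the real init receives an AdaptorContext, and a real poller would typically use the DocIdPusher from that context to push changed DocIds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BackgroundWorkSketch {
  /** Hypothetical adaptor fragment; the real init receives an AdaptorContext. */
  static class PollingAdaptor {
    private volatile ScheduledExecutorService scheduler;
    final CountDownLatch firstPoll = new CountDownLatch(1);

    /** Starts the behind-the-scenes worker. */
    void init() {
      scheduler = Executors.newSingleThreadScheduledExecutor();
      scheduler.scheduleWithFixedDelay(
          this::pollForChanges, 0, 50, TimeUnit.MILLISECONDS);
    }

    /** Placeholder for change detection; here it only signals that it ran. */
    private void pollForChanges() {
      firstPoll.countDown();
    }

    /** Stops the worker and waits briefly for it to finish. */
    void destroy() {
      scheduler.shutdownNow();
      try {
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    boolean isStopped() {
      return scheduler.isTerminated();
    }
  }

  /** Returns true if the worker ran at least once and was stopped by destroy. */
  static boolean demo() {
    PollingAdaptor adaptor = new PollingAdaptor();
    adaptor.init();
    boolean polled;
    try {
      polled = adaptor.firstPoll.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      polled = false;
    } finally {
      adaptor.destroy();
    }
    return polled && adaptor.isStopped();
  }

  public static void main(String[] args) {
    System.out.println("worker ran and stopped: " + demo());
  }
}
```

Pairing every thread started in init with a matching shutdown in destroy keeps the adaptor from leaking threads when it is stopped.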

destroy

void destroy()
Shut down the adaptor and release its resources.