Package | Description |
---|---|
com.google.enterprise.adaptor | Adaptor interfaces and implementation. |
com.google.enterprise.adaptor.examples | |
com.google.enterprise.adaptor.experimental | |
com.google.enterprise.adaptor.prebuilt | |
com.google.enterprise.adaptor.testing | |
Easily provide repository data to a Google Search Appliance (GSA).
Note: If you would like to use a language other than Java, take a look at CommandLineAdaptor.
In the GSA's Admin Console, go to Content Sources > Feeds, and scroll down to List of Trusted IP Addresses. Add the IP address for the adaptor to the list.
In the Admin Console, go to Content Sources > Web Crawl > Start and Block URLs, and scroll down to Follow Patterns. Add an entry like hostname:port/ where hostname is the hostname of the machine that hosts the adaptor and port defaults to 5678 (read on to change the port number).
You need the adaptor library jar (eg: adaptor-20130612-withlib.jar) and extracted adaptor examples jar (eg: examples/adaptor-20130612-examples.jar).
If you are working from source code instead of a release, you can build the required jars by running:
ant dist
cd dist
The needed jars will be in a zip file within the current directory (eg: adaptor-20130612-bin.zip will have adaptor-20130612-withlib.jar and examples/adaptor-20130612-examples.jar).
Create an adaptor-config.properties text file in the current directory that looks like:
gsa.hostname=mygsahostname
You should replace mygsahostname with the hostname or IP of your GSA. This file also lets you configure other aspects of the adaptor library, such as the server port and feed name:
gsa.hostname=mygsahostname
server.port=6677
feed.name=mydocfeedtogsa
Later, if you have trouble with the adaptor library incorrectly auto-detecting your computer's hostname, then you may need to add a line like:
server.hostname=yourcomputershostname
For a list and explanation of the available configuration options, see Config.
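As a rough illustration of the file format only, the sketch below parses the same key=value text with the standard java.util.Properties class. It is not how the adaptor library itself reads its configuration (Config is the reference for that). The class name ConfigFileSketch and its methods are hypothetical, and the fallback of 5678 simply mirrors the default port mentioned earlier.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the adaptor library. */
final class ConfigFileSketch {
  /** Parses key=value lines in the standard Java properties format. */
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      // Reading from an in-memory string should not fail; rethrow unchecked.
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Returns server.port, falling back to 5678 when the key is absent. */
  static int serverPort(Properties props) {
    return Integer.parseInt(props.getProperty("server.port", "5678"));
  }

  public static void main(String[] args) {
    Properties props = parse(
        "gsa.hostname=mygsahostname\nserver.port=6677\nfeed.name=mydocfeedtogsa\n");
    System.out.println(props.getProperty("gsa.hostname") + " " + serverPort(props));
  }
}
```

Each line of the file becomes one property, so a missing server.port line is handled by the fallback rather than by an error.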
Run the example adaptor. For Windows:
java -cp adaptor-20130612-withlib.jar;examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate
For all other OSes:
java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate
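The only difference between the two commands is the classpath separator: ";" on Windows and ":" elsewhere. If you launch the adaptor from Java code or a build script, a minimal sketch like the following can pick the separator for you. ClasspathSketch is a hypothetical name; File.pathSeparator is the standard JDK constant for the platform the JVM is running on.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of the adaptor library. */
final class ClasspathSketch {
  /** Joins jar paths with the given separator (";" on Windows, ":" elsewhere). */
  static String join(String separator, String... jars) {
    return String.join(separator, jars);
  }

  public static void main(String[] args) {
    // File.pathSeparator matches the platform this JVM is running on.
    String cp = join(File.pathSeparator,
        "adaptor-20130612-withlib.jar",
        "examples/adaptor-20130612-examples.jar");
    System.out.println("java -cp " + cp
        + " com.google.enterprise.adaptor.examples.AdaptorTemplate");
  }
}
```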
Go to Content Sources > Diagnostics > Crawl Status and click Resume Crawl if the crawling system is currently paused.
In the GSA, go to Content Sources > Feeds. In the Current Feeds section, you should see an entry named "adaptor_HOSTNAME_PORT" (the name can be changed by setting the feed.name configuration variable).
In the adaptor log, look for document ids being pushed and requests for document contents being served.
See Adaptor and AbstractAdaptor.
From the source download (eg: adaptor-20130612-src.zip), make a copy of src/com/google/enterprise/adaptor/examples/AdaptorTemplate.java under your own package and name. You will need to modify the contents appropriately for the new package and name.
Include adaptor-20130612-withlib.jar in your classpath. Note that the date may be different.
An adaptor, by default, will deny all document accesses, except from the GSA. To allow debugging and testing an adaptor without a GSA, you can add a hostname to the server.fullAccessHosts config key to allow that computer full access to all adaptor content. In addition, this setting allows that computer to see metadata and other GSA-specific information as HTTP headers. This can be very useful when combined with Firebug or the Web Inspector in your browser to observe an adaptor's behavior.
You can set configuration variables on the command line instead of in adaptor-config.properties. You are allowed multiple arguments of the form "-Dconfigkey=configvalue". A value provided on the command line overrides both the default value and the value (if any) in the configuration file. For example:
java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate -Dgsa.hostname=mygsahostname -Dserver.port=6677
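The override order described above (command line over configuration file over default) can be pictured with the following sketch. It only models the precedence rule using java.util.Properties and is not the library's own argument parsing. OverrideSketch and its method are hypothetical names.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the adaptor library. */
final class OverrideSketch {
  /** Overlays -Dkey=value arguments on top of values read from a file. */
  static Properties effective(Properties fromFile, String[] args) {
    Properties result = new Properties();
    result.putAll(fromFile);
    for (String arg : args) {
      int eq = arg.indexOf('=');
      // Accept only arguments shaped like -Dkey=value with a non-empty key.
      if (arg.startsWith("-D") && eq > 2) {
        // A command-line value replaces any value already present.
        result.setProperty(arg.substring(2, eq), arg.substring(eq + 1));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Properties file = new Properties();
    file.setProperty("gsa.hostname", "mygsahostname");
    Properties merged = effective(file, new String[] {"-Dserver.port=6677"});
    // Keys missing from both sources fall back to the supplied default.
    System.out.println(merged.getProperty("server.port", "5678"));
  }
}
```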
Download and extract prunsrv.exe from the latest Windows binary download of Apache Commons Daemon. If you are running on 64-bit Windows and will use a 64-bit JVM, then you should use the prunsrv.exe in the amd64/ directory. Place prunsrv.exe in the same directory as the adaptor you would like to run as a service.
You can then register the service:
prunsrv install someadaptor --StartPath="%CD%" ^
  --Classpath=someadaptor-withlib.jar ^
  --StartMode=jvm --StartClass=com.google.enterprise.adaptor.Daemon ^
  --StartMethod=serviceStart --StartParams=package.SomeAdaptor ^
  --StopMode=jvm --StopClass=com.google.enterprise.adaptor.Daemon ^
  --StopMethod=serviceStop --StdOutput=stdout.log --StdError=stderr.log ^
  ++JvmOptions=-Djava.util.logging.config.file=logging.properties ^
  --Startup=auto
Where someadaptor is a unique, arbitrary service name.
To start the service, use the Windows service management tool or run:
prunsrv start someadaptor
Where someadaptor is the same service name used during registration.
Security is not enabled by default because it requires a reasonable amount of setup, on both the GSA and adaptor. The GSA needs a valid certificate for the hostname you are accessing it with (gsa.hostname). Thus, the default one it ships with cannot be valid and you need to generate a new one. Setting up security is required before users can access non-public documents directly from the adaptor.
In the GSA's Admin Console, go to Administration > SSL Settings. Under the Create a New SSL Certificate heading, change Host Name to the GSA's hostname, written exactly as the adaptor will use it. Then click Create Self-Signed Certificate and wait for the operation to complete. Then click Install SSL Certificate and wait for that operation to complete (about 1 minute). You now have a valid self-signed certificate, but the adaptor does not yet trust it.
You need to get the GSA's freshly-created certificate to add it as a trusted host for the adaptor:
openssl s_client -connect gsahostname:443 < /dev/null
Copy the section that begins with -----BEGIN CERTIFICATE----- and ends with -----END CERTIFICATE----- (including the BEGIN and END CERTIFICATE portions) into a new file. Save the file in your adaptor's directory with the name "gsa.crt".
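If you would rather not copy the block by hand, the extraction step can be sketched in a few lines of Java. This is only an illustration of cutting the text between the two markers. PemSketch is a hypothetical name, and the sample input in main is a placeholder rather than a real certificate.

```java
/** Hypothetical helper for illustration; not part of the adaptor library. */
final class PemSketch {
  private static final String BEGIN = "-----BEGIN CERTIFICATE-----";
  private static final String END = "-----END CERTIFICATE-----";

  /** Returns the first certificate block, markers included, or null if none. */
  static String firstCertificate(String opensslOutput) {
    int start = opensslOutput.indexOf(BEGIN);
    if (start < 0) {
      return null;
    }
    int end = opensslOutput.indexOf(END, start);
    if (end < 0) {
      return null;
    }
    return opensslOutput.substring(start, end + END.length());
  }

  public static void main(String[] args) {
    // Placeholder text standing in for the output of openssl s_client.
    String sample = "CONNECTED(00000003)\n" + BEGIN + "\n(base64 data)\n" + END + "\n---\n";
    System.out.println(firstCertificate(sample));
  }
}
```

The returned text is what would be saved as gsa.crt.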
Now you should generate a self-signed certificate for the adaptor and export the newly created certificate. Within the adaptor's directory, you should run:
keytool -genkeypair -keystore keys.jks -storepass changeit -keypass changeit -alias adaptor -keyalg RSA -validity 365
For "What is your first and last name?", you should enter the hostname of the adaptor's computer. You are free to answer the other questions however you wish (including not answering them). When you are happy with your answers, answer "yes" to "Is CN=yourcomputershostname, OU=... correct?"
Then, still in the adaptor's directory, you should run:
keytool -exportcert -alias adaptor -keystore keys.jks -storepass changeit -keypass changeit -rfc -file adaptor.crt
Copy cacerts from Java to the adaptor's directory. For Windows:
copy PATH\TO\JRE\lib\security\cacerts cacerts.jks
For all other OSes:
cp PATH/TO/JRE/lib/security/cacerts cacerts.jks
To allow the adaptor to trust itself, execute:
keytool -importcert -keystore cacerts.jks -storepass changeit -file adaptor.crt -alias adaptor
Answer "yes" to "Trust this certificate?"
To allow the adaptor to trust the GSA, execute:
keytool -importcert -keystore cacerts.jks -storepass changeit -file gsa.crt -alias gsa
Answer "yes" to "Trust this certificate?"
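To double-check that both imports landed in cacerts.jks, you could inspect the keystore with the standard java.security.KeyStore API, as in the sketch below. TrustStoreSketch is a hypothetical name. The file name, password, and aliases are the ones used in the commands above, and the code assumes your JDK can read the file as a "JKS" keystore.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

/** Hypothetical helper for illustration; not part of the adaptor library. */
final class TrustStoreSketch {
  /** Reports whether the keystore file at path has an entry for alias. */
  static boolean hasAlias(Path path, char[] password, String alias) {
    try (InputStream in = Files.newInputStream(path)) {
      KeyStore store = KeyStore.getInstance("JKS");
      store.load(in, password);
      return store.containsAlias(alias);
    } catch (IOException | GeneralSecurityException e) {
      // Missing file, wrong password, or unreadable keystore.
      throw new IllegalStateException("Cannot read keystore " + path, e);
    }
  }

  public static void main(String[] args) {
    Path store = Paths.get("cacerts.jks");
    if (!Files.exists(store)) {
      System.out.println("cacerts.jks not found in the current directory");
      return;
    }
    for (String alias : new String[] {"adaptor", "gsa"}) {
      System.out.println(alias + ": " + hasAlias(store, "changeit".toCharArray(), alias));
    }
  }
}
```

Running keytool -list -keystore cacerts.jks gives the same information without any code.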
To allow the GSA to trust the adaptor, within the GSA's Admin Console, go to Administration > Certificate Authorities. Click the Choose File button (this button could be called "Browse...") under the Add more Certificate Authorities heading. Choose "adaptor.crt" in the adaptor's directory and click Save Settings.
Now that everything is prepared, you can flip the security switch with the adaptor by adding a line to your adaptor-config.properties:
server.secure=true
The adaptor can now use the GSA's authentication configuration and will use HTTPS for all communication.
Example command line to run secure:
java \
  -Djava.util.logging.config.file=logging.properties \
  -Djavax.net.ssl.keyStore=keys.jks \
  -Djavax.net.ssl.keyStoreType=jks \
  -Djavax.net.ssl.keyStorePassword=changeit \
  -Djavax.net.ssl.trustStore=cacerts.jks \
  -Djavax.net.ssl.trustStoreType=jks \
  -Djavax.net.ssl.trustStorePassword=changeit \
  -classpath 'adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar' \
  com.google.enterprise.adaptor.examples.AdaptorWithCrawlTimeMetadataTemplate
There are additional security options you can control on the GSA. You may want to try running an adaptor with server.secure set before enabling these stricter features. Within the GSA's Admin Console, go to Administration > SSL Settings. There you can:
Click Save Setup to save your changes.
Note: By using these settings you improve security, but also require all adaptors to be configured for security and have server.secure=true in their configuration.