Kneobase is an enterprise search engine built on the Lucene search engine and the Spring framework. It performs full-text search across many different content sources. It is highly adaptable out of the box and has a pluggable architecture.
lib/
    PDFBox-0.7.0.jar
    Tidy.jar
    metadata-extractor-2.2.2.jar
    poi-2.5.1-final-20040804.jar
    poi-scratchpad-2.5.1-final-20040804.jar
    tagstripper.jar
    tm-extractors-0.4.jar
lib-no-deploy/
    nekohtml-0.9.2.jar
    nekohtmlXni-0.9.2.jar
    xercesImpl.jar
    xml-apis.jar
src/
    com/
        kneobase/
            extractors/
                parser/
                    A_StringParser.java
                    ExcelPOIParser.java
                    ExcelParser.java
                    HtmlJTidyParser.java
                    HtmlParser.java
                    OpenOfficeParser.java
                    PdfBoxParser.java
                    PdfParser.java
                    PlainParser.java
                    PptPOIParser.java
                    PptParser.java
                    RtfParser.java
                    TaggedParser.java
                    WordPOIParser.java
                    WordParser.java
                    WordTextMiningParser.java
                    XmlParser.java
                    XmlSAXParser.java
                ExcelBuilder.java
                HtmlBuilder.java
                OpenOfficeBuilder.java
                PdfBuilder.java
                PlainBuilder.java
                PptBuilder.java
                RtfBuilder.java
                TaggedBuilder.java
                WordBuilder.java
                XmlBuilder.java
src-not-included/
    com/
        kneobase/
            extractors/
                parser/
                    HtmlNekoParser.java
.classpath
.project
build.xml
log4j.properties
project.properties
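
The listing has one parser per document format, which fits the pluggable architecture mentioned in the description. The sketch below shows one way such a plug-in point could look in plain Java. It is not taken from the Kneobase sources: the names TextExtractor, ExtractorRegistry and ExtractorDemo are hypothetical, and choosing an extractor by file extension is an assumption made only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical plug-in contract: turn a raw document stream into plain text.
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

// Hypothetical registry that selects an extractor by file extension (an assumption).
final class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    // Extensions are stored in lower case so lookups are case-insensitive.
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns an empty Optional when the name has no extension or no extractor is registered.
    Optional<TextExtractor> forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(byExtension.get(extension));
    }
}

final class ExtractorDemo {
    // Simplest possible extractor: reads the whole stream as UTF-8 text (Java 9+).
    static final TextExtractor PLAIN =
            in -> new String(in.readAllBytes(), StandardCharsets.UTF_8);

    public static void main(String[] args) throws IOException {
        ExtractorRegistry registry = new ExtractorRegistry();
        registry.register("txt", PLAIN);

        InputStream in = new ByteArrayInputStream(
                "full-text search".getBytes(StandardCharsets.UTF_8));
        Optional<TextExtractor> extractor = registry.forFile("notes.TXT");
        if (extractor.isPresent()) {
            // The extracted text is what would be handed to the indexer.
            System.out.println(extractor.get().extract(in));
        }
    }
}
```

With a layout like this, supporting a new format means registering one more extractor and leaving the calling code unchanged. A real implementation for PDF or Word would delegate to a library such as those in lib/.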