• Lucene Full-Text Search



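    Lucene is a Java-based toolkit for full-text information retrieval. The mainstream search systems Elasticsearch and Solr are both built on top of Lucene's indexing and search capabilities.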
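
    To make "indexing and search" concrete, here is a minimal sketch of an inverted index, the kind of structure a full-text library is built around. It is plain Java and not Lucene's actual implementation: the class `TinyInvertedIndex` and its methods are hypothetical names used only for illustration, and the tokenizer is a deliberately naive lowercase-and-split.

    ```java
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Locale;
    import java.util.Map;
    import java.util.SortedSet;
    import java.util.TreeSet;

    // Hypothetical toy class, not part of Lucene.
    class TinyInvertedIndex {
        // term -> sorted ids of the documents that contain the term
        private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

        // Naive analysis: lowercase, then split on anything that is not a letter or digit.
        void add(int docId, String text) {
            for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
                }
            }
        }

        // Looks up a single term; the query term is lowercased the same way as the text.
        SortedSet<Integer> search(String term) {
            return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                    Collections.emptySortedSet());
        }

        public static void main(String[] args) {
            TinyInvertedIndex index = new TinyInvertedIndex();
            index.add(1, "Qingdao is a beautiful city.");
            index.add(2, "Nanjing is a city of culture.");
            index.add(3, "Shanghai is a bustling city.");
            System.out.println(index.search("city"));    // prints [1, 2, 3]
            System.out.println(index.search("culture")); // prints [2]
        }
    }
    ```

    A lookup only touches the posting list of the queried term instead of scanning every document, which is why searching an index is fast even when the collection is large.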

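    Differences between Solr and Lucene:

    Solr and Lucene differ in three essential ways: search server, enterprise focus, and management.

    Lucene is fundamentally a search library, not a standalone application; Solr is a standalone search server.

    Lucene focuses on the low-level building blocks of search, while Solr focuses on enterprise applications.

    Lucene does not provide the management features needed to run a search service; Solr does.

    In one sentence: Solr is an extension of Lucene aimed at enterprise search applications. As a rough rule of thumb (not a hard limit of the library), working with bare Lucene starts to feel strained once the data set grows beyond about 100,000 records.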

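    ES:
    ES is short for Elasticsearch, a real-time, distributed search and analytics engine.
    1: ES is built on Apache Lucene and wraps it.
    2: The goal of ES is to make full-text search simple.
    3: ES scales horizontally and can handle PB-scale structured and unstructured data.
    4: With ES, big data can be processed at very high speed.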

    Today's topic is Lucene:

    Building an index with Lucene:

    package com.zking.test.lucene;
    
    /**
     * Index-creation test.
     * @author Administrator
     *
     */
    public class Demo1 {
    	public static void main(String[] args) {
    		// Directory where the index files will be stored
    		String indexDir = "E:\\temp\\test\\lucene\\demo1";
    		// Directory containing the source files to index
    		String dataDir = "E:\\temp\\test\\lucene\\demo1\\data";
    		IndexCreate ic = null; 
    		try {
    			ic = new IndexCreate(indexDir);
    			long start = System.currentTimeMillis();
    			int num = ic.index(dataDir);
    			long end = System.currentTimeMillis();
    			System.out.println("The index now holds " + num + " documents; indexing took " + (end - start) + " ms");
    		} catch (Exception e) {
    			e.printStackTrace();
    		} finally {
    			try {
    				// ic is still null if the constructor threw, so guard against a NullPointerException
    				if (ic != null) {
    					ic.closeIndexWriter();
    				}
    			} catch (Exception e) {
    				e.printStackTrace();
    			}
    		}
    	}
    }
    
    

    The implementation used by Demo1:

    package com.zking.test.lucene;
    
    import java.io.File;
    import java.io.FileReader;
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    
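    /**
     * Lucene hello-world implementation used by Demo1.java.
     * @author Administrator
     *
     */
    public class IndexCreate {
    	private IndexWriter indexWriter;
    	
    	/**
    	 * 1. Constructor: instantiates the IndexWriter.
    	 * @param indexDir directory where the index files are stored
    	 * @throws Exception
    	 */
    	public IndexCreate(String indexDir) throws Exception{
    		// Open the directory that holds the index files
    		FSDirectory dir = FSDirectory.open(Paths.get(indexDir));
    		// Standard analyzer (suited to English text)
    		Analyzer analyzer = new StandardAnalyzer();
    		// Configuration object for the index writer
    		IndexWriterConfig conf = new IndexWriterConfig(analyzer); 
    		indexWriter = new IndexWriter(dir, conf);
    	}
    	
    	/**
    	 * 2. Closes the index writer.
    	 * @throws Exception
    	 */
    	public void closeIndexWriter()  throws Exception{
    		indexWriter.close();
    	}
    	
    	/**
    	 * 3. Indexes every file directly under the given path.
    	 * @param dataDir directory containing the source files
    	 * @return number of documents in the index
    	 * @throws Exception
    	 */
    	public int index(String dataDir) throws Exception{
    		File[] files = new File(dataDir).listFiles();
    		// listFiles() returns null when dataDir does not exist or is not a directory
    		if (files == null) {
    			throw new IllegalArgumentException("Not a readable directory: " + dataDir);
    		}
    		for (File file : files) {
    			// Skip subdirectories: a FileReader can only be opened on regular files
    			if (file.isFile()) {
    				indexFile(file);
    			}
    		}
    		return indexWriter.numDocs();
    	}
    	
    	/**
    	 * 4. Indexes a single file.
    	 * @param file
    	 * @throws Exception
    	 */
    	private void indexFile(File file) throws Exception{
    		System.out.println("Full path of the file being indexed: "+file.getCanonicalPath());
    		Document doc = getDocument(file);
    		indexWriter.addDocument(doc);
    	}
    	
    	/**
    	 * 5. Builds the document (the key information kept in the index, as key-value fields).
    	 * @param file
    	 * @return
    	 * @throws Exception
    	 */
    	private Document getDocument(File file) throws Exception{
    		Document doc = new Document();
    		// A Reader-based TextField is tokenized and indexed, but its content is not stored
    		doc.add(new TextField("contents", new FileReader(file)));
    		// Field.Store.YES keeps the original value in the index so it can be returned with search hits
    		doc.add(new TextField("fullPath", file.getCanonicalPath(),Field.Store.YES));
    		doc.add(new TextField("fileName", file.getName(),Field.Store.YES));
    		return doc;
    	}
    }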
    
    

    Searching an index with Lucene:

    package com.zking.test.lucene;
    
    /**
     * Index-search test.
     * @author Administrator
     *
     */
    public class Demo2 {
    	public static void main(String[] args) {
    		String indexDir = "E:\\temp\\test\\lucene\\demo1";
    		String q = "EarlyTerminating-Collector";
    		try {
    			IndexUse.search(indexDir, q);
    		} catch (Exception e) {
    			e.printStackTrace();
    		}
    	}
    }
    
    

    The implementation used by Demo2:

    package com.zking.test.lucene;
    
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;
    
    /**
     * Lucene hello-world implementation used by Demo2.java.
     * @author Administrator
     *
     */
    public class IndexUse {
    	/**
    	 * Searches the index directory for a keyword.
    	 * @param indexDir	directory containing the index files
    	 * @param q	keyword (query string)
    	 */
    	public static void search(String indexDir, String q) throws Exception{
    		FSDirectory indexDirectory = FSDirectory.open(Paths.get(indexDir));
    		// Note: an index reader is not created with new; it is opened through DirectoryReader.
    		// try-with-resources closes the reader even if parsing or searching fails.
    		try (IndexReader indexReader = DirectoryReader.open(indexDirectory)) {
    			// Create the searcher on top of the reader
    			IndexSearcher indexSearcher = new IndexSearcher(indexReader);
    			Analyzer analyzer = new StandardAnalyzer();
    			QueryParser queryParser = new QueryParser("contents", analyzer);
    			// Parse the keyword into a query object
    			Query query = queryParser.parse(q);
    			
    			long start=System.currentTimeMillis();
    			// Fetch the 10 best-scoring matching documents
    			TopDocs topDocs = indexSearcher.search(query , 10);
    			long end=System.currentTimeMillis();
    			System.out.println("Matching "+q+" took "+(end-start)+" ms and found "+topDocs.totalHits+" documents");
    			
    			for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
    				int docID = scoreDoc.doc;
    				// The searcher loads the stored document by its document number
    				Document doc = indexSearcher.doc(docID);
    				System.out.println("Hit in file: "+doc.get("fullPath"));
    			}
    		}
    	}
    }
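
    The call `search(query, 10)` in the listing above returns at most the 10 best-scoring documents, while `totalHits` counts every match. The following plain-Java sketch illustrates that distinction with a bounded min-heap. It is not how Lucene is implemented internally; `TopNSketch`, `Hit`, and the convention that a score of 0 means "no match" are assumptions made for this example.

    ```java
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import java.util.PriorityQueue;

    // Hypothetical toy class, not part of Lucene.
    class TopNSketch {
        /** Stand-in for a scored hit: document number plus score. */
        static final class Hit {
            final int doc;
            final double score;
            Hit(int doc, double score) { this.doc = doc; this.score = score; }
        }

        /** Counts every matching document (assumption: score > 0 means "matches"). */
        static int totalHits(double[] scores) {
            int count = 0;
            for (double s : scores) {
                if (s > 0) count++;
            }
            return count;
        }

        /** Keeps only the n best-scoring matches and returns them best first. */
        static List<Hit> topN(double[] scores, int n) {
            Comparator<Hit> byScore = Comparator.comparingDouble(h -> h.score);
            PriorityQueue<Hit> heap = new PriorityQueue<>(byScore); // weakest hit on top
            for (int doc = 0; doc < scores.length; doc++) {
                if (scores[doc] <= 0) continue;        // not a match
                heap.add(new Hit(doc, scores[doc]));
                if (heap.size() > n) heap.poll();      // evict the weakest hit
            }
            List<Hit> result = new ArrayList<>(heap);
            result.sort(byScore.reversed());
            return result;
        }

        public static void main(String[] args) {
            double[] scores = {0.0, 0.3, 0.9, 0.0, 0.5};
            System.out.println(totalHits(scores));     // prints 3
            for (Hit h : topN(scores, 2)) {
                System.out.println(h.doc);             // prints 2, then 4
            }
        }
    }
    ```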
    
    

    Adding, deleting, and updating documents in an index:

    package com.zking.test.lucene;
    
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Before;
    import org.junit.Test;
    
    /**
     * Building an index:
     * 	adding, deleting, and updating documents
     * @author Administrator
     *
     */
    public class Demo3 {
    	private String ids[]={"1","2","3"};
    	private String citys[]={"qingdao","nanjing","shanghai"};
    	private String descs[]={
    			"Qingdao is a beautiful city.",
    			"Nanjing is a city of culture.",
    			"Shanghai is a bustling city."
    	};
    	private FSDirectory dir;
    	
    	/**
    	 * Writes the sample documents before every test.
    	 * Note: the default open mode appends to an existing index, so documents accumulate across runs.
    	 * @throws Exception
    	 */
    	@Before
    	public void setUp() throws Exception {
    		dir  = FSDirectory.open(Paths.get("E:\\temp\\test\\lucene\\demo2\\indexDir"));
    		IndexWriter indexWriter = getIndexWriter();
    		for (int i = 0; i < ids.length; i++) {
    			Document doc = new Document();
    			doc.add(new StringField("id", ids[i], Field.Store.YES));
    			doc.add(new StringField("city", citys[i], Field.Store.YES));
    			doc.add(new TextField("desc", descs[i], Field.Store.NO));
    			indexWriter.addDocument(doc);
    		}
    		indexWriter.close();
    	}
    
    	/**
    	 * Creates the index writer.
    	 * @return
    	 * @throws Exception
    	 */
    	private IndexWriter getIndexWriter()  throws Exception{
    		Analyzer analyzer = new StandardAnalyzer();
    		IndexWriterConfig conf = new IndexWriterConfig(analyzer);
    		return new IndexWriter(dir, conf );
    	}
    	
    	/**
    	 * Checks how many documents have been written to the index.
    	 * @throws Exception
    	 */
    	@Test
    	public void getWriteDocNum() throws Exception {
    		IndexWriter indexWriter = getIndexWriter();
    		System.out.println("The index contains "+indexWriter.numDocs()+" documents");
    		// Close the writer to release write.lock; otherwise the next writer fails with LockObtainFailedException
    		indexWriter.close();
    	}
    	
    	/**
    	 * Only marks the document as deleted; it is not physically removed from the index yet.
    	 * @throws Exception
    	 */
    	@Test
    	public void deleteDocBeforeMerge() throws Exception {
    		IndexWriter indexWriter = getIndexWriter();
    		System.out.println("maxDoc (including deleted): "+indexWriter.maxDoc());
    		indexWriter.deleteDocuments(new Term("id", "1"));
    		indexWriter.commit();
    		
    		System.out.println("maxDoc (including deleted): "+indexWriter.maxDoc());
    		System.out.println("numDocs (live documents): "+indexWriter.numDocs());
    		indexWriter.close();
    	}
    	
    	/**
    	 * The deleted document is physically removed by the merge
    	 * (according to the original note, its analyzed terms are still kept in this version).
    	 * @throws Exception
    	 */
    	@Test
    	public void deleteDocAfterMerge() throws Exception {
    //		https://blog.csdn.net/asdfsadfasdfsa/article/details/78820030
    //		org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine:
    //		only one IndexWriter may be open on an index directory at a time. IndexWriter is thread-safe, so share one instance instead of opening several.
    		IndexWriter indexWriter = getIndexWriter();
    		System.out.println("maxDoc (including deleted): "+indexWriter.maxDoc());
    		indexWriter.deleteDocuments(new Term("id", "1"));
    		indexWriter.forceMergeDeletes(); // force segments with deletions to be merged, purging the deleted documents
    		indexWriter.commit();
    		
    		System.out.println("maxDoc (including deleted): "+indexWriter.maxDoc());
    		System.out.println("numDocs (live documents): "+indexWriter.numDocs());
    		indexWriter.close();
    	}
    	
    	/**
    	 * Tests updating a document (delete by term, then add the new version).
    	 * @throws Exception
    	 */
    	@Test
    	public void testUpdate()throws Exception{
    		IndexWriter writer=getIndexWriter();
    		Document doc=new Document();
    		doc.add(new StringField("id", "1", Field.Store.YES));
    		doc.add(new StringField("city","qingdao",Field.Store.YES));
    		doc.add(new TextField("desc", "dsss is a city.", Field.Store.NO));
    		writer.updateDocument(new Term("id","1"), doc);
    		writer.close();
    	}
    }
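
    The two delete tests above differ only in whether a merge runs after the delete. The following plain-Java sketch models that difference: a delete first sets a mark, and the slot is reclaimed only when a merge rewrites the data. It is a simplified model rather than Lucene's segment implementation; `SoftDeleteSketch` and its methods are hypothetical names that merely mirror `maxDoc()`, `numDocs()`, and `forceMergeDeletes()`.

    ```java
    import java.util.ArrayList;
    import java.util.BitSet;
    import java.util.List;

    // Hypothetical toy class, not part of Lucene.
    class SoftDeleteSketch {
        private final List<String> docs = new ArrayList<>(); // one slot per added document
        private final BitSet deleted = new BitSet();          // marks for deleted slots

        void add(String id) {
            docs.add(id);
        }

        // Deleting only sets a mark; the slot stays occupied.
        void delete(String id) {
            for (int i = 0; i < docs.size(); i++) {
                if (docs.get(i).equals(id)) deleted.set(i);
            }
        }

        // Counts every slot, including those marked as deleted.
        int maxDoc() {
            return docs.size();
        }

        // Counts live documents only.
        int numDocs() {
            return docs.size() - deleted.cardinality();
        }

        // Rewrites the slots without the marked documents and clears the marks.
        void mergeDeletes() {
            List<String> live = new ArrayList<>();
            for (int i = 0; i < docs.size(); i++) {
                if (!deleted.get(i)) live.add(docs.get(i));
            }
            docs.clear();
            docs.addAll(live);
            deleted.clear();
        }

        public static void main(String[] args) {
            SoftDeleteSketch index = new SoftDeleteSketch();
            index.add("1");
            index.add("2");
            index.add("3");
            index.delete("1");
            System.out.println(index.maxDoc() + " " + index.numDocs()); // prints 3 2
            index.mergeDeletes();
            System.out.println(index.maxDoc() + " " + index.numDocs()); // prints 2 2
        }
    }
    ```

    Deferring the physical removal keeps deletes cheap, at the cost of wasted space until the next merge.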
    
    

    Boosting a document field:

    package com.zking.test.lucene;
    
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Before;
    import org.junit.Test;
    
    /**
     * Boosting a document field.
     * @author Administrator
     *
     */
    public class Demo4 {
    	private String ids[]={"1","2","3","4"};
    	private String authors[]={"Jack","Marry","John","Json"};
    	private String positions[]={"accounting","technician","salesperson","boss"};
    	private String titles[]={"Java is a good language.","Java is a cross platform language","Java powerful","You should learn java"};
    	private String contents[]={
    			"If possible, use the same JRE major version at both index and search time.",
    			"When upgrading to a different JRE major version, consider re-indexing. ",
    			"Different JRE major versions may implement different versions of Unicode,",
    			"For example: with Java 1.4, `LetterTokenizer` will split around the character U+02C6,"
    	};
    	
    	private Directory dir;// index directory
    
    	@Before
    	public void setUp()throws Exception {
    		dir = FSDirectory.open(Paths.get("E:\\temp\\test\\lucene\\demo3\\indexDir"));
    		IndexWriter writer = getIndexWriter();
    		for (int i = 0; i < authors.length; i++) {
    			Document doc = new Document();
    			doc.add(new StringField("id", ids[i], Field.Store.YES));
    			doc.add(new StringField("author", authors[i], Field.Store.YES));
    			doc.add(new StringField("position", positions[i], Field.Store.YES));
    			
    			TextField textField = new TextField("title", titles[i], Field.Store.YES);
    			
    			// Json paid for advertising, so his title is boosted to the top of the ranking
    			if("boss".equals(positions[i])) {
    				// Set the boost (default is 1). Index-time boosts like this were removed in Lucene 7.
    				textField.setBoost(2f);
    			}
    			
    			doc.add(textField);
    			// TextField is tokenized; StringField is indexed as a single term
    			doc.add(new TextField("content", contents[i], Field.Store.NO));
    			writer.addDocument(doc);
    		}
    		writer.close();
    		
    	}
    
    	private IndexWriter getIndexWriter() throws Exception{
    		Analyzer analyzer = new StandardAnalyzer();
    		IndexWriterConfig conf = new IndexWriterConfig(analyzer);
    		return new IndexWriter(dir, conf);
    	}
    	
    	@Test
    	public void index() throws Exception{
    		IndexReader reader = DirectoryReader.open(dir);
    		IndexSearcher searcher = new IndexSearcher(reader);
    		String fieldName = "title";
    		String keyWord = "java";
    		Term t = new Term(fieldName, keyWord);
    		Query query = new TermQuery(t);
    		TopDocs hits = searcher.search(query, 10);
    		System.out.println("Keyword '"+keyWord+"' matched "+hits.totalHits+" documents");
    		// Hits are returned in descending score order, so the boosted title should come first
    		for (ScoreDoc scoreDoc : hits.scoreDocs) {
    			Document doc = searcher.doc(scoreDoc.doc);
    			System.out.println(doc.get("author"));
    		}
    		reader.close();
    	}
    	
    }
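
    To show why a boost of 2 moves Json's title to the top, here is a toy ranking in plain Java. The formula (term count times boost, divided by the square root of the title length) is an assumption chosen for illustration and is much simpler than Lucene's real scoring; `BoostSketch` and its methods are hypothetical names.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Locale;

    // Hypothetical toy class, not part of Lucene.
    class BoostSketch {
        /** Toy score: occurrences of the term, scaled by the boost and by 1/sqrt(token count). */
        static double score(String title, String term, double boost) {
            int tf = 0;     // how often the term occurs in the title
            int length = 0; // number of tokens in the title
            for (String token : title.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (token.isEmpty()) continue;
                length++;
                if (token.equals(term)) tf++;
            }
            return length == 0 ? 0.0 : boost * tf / Math.sqrt(length);
        }

        /** Returns the authors ordered by descending toy score of their titles. */
        static List<String> rank(String[] authors, String[] titles, double[] boosts, String term) {
            Integer[] order = new Integer[authors.length];
            for (int i = 0; i < order.length; i++) order[i] = i;
            Arrays.sort(order, (a, b) -> Double.compare(
                    score(titles[b], term, boosts[b]),
                    score(titles[a], term, boosts[a])));
            List<String> ranked = new ArrayList<>();
            for (int i : order) ranked.add(authors[i]);
            return ranked;
        }

        public static void main(String[] args) {
            String[] authors = {"Jack", "Marry", "John", "Json"};
            String[] titles = {"Java is a good language.", "Java is a cross platform language",
                    "Java powerful", "You should learn java"};
            double[] noBoost = {1, 1, 1, 1};
            double[] bossBoost = {1, 1, 1, 2};
            System.out.println(rank(authors, titles, noBoost, "java"));   // prints [John, Json, Jack, Marry]
            System.out.println(rank(authors, titles, bossBoost, "java")); // prints [Json, John, Jack, Marry]
        }
    }
    ```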
    
    

    Term queries and query expressions (QueryParser):

    package com.zking.test.lucene;
    
    import java.io.IOException;
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.ParseException;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.NumericRangeQuery;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Before;
    import org.junit.Test;
    
    /**
     * Term queries
     * Query expressions (QueryParser)
     * @author Administrator
     *
     */
    public class Demo5 {
    	@Before
    	public void setUp() {
    		// Directory where the index files will be stored
    		String indexDir = "E:\\temp\\test\\lucene\\demo4";
    		// Directory containing the source files to index
    		String dataDir = "E:\\temp\\test\\lucene\\demo4\\data";
    		IndexCreate ic = null;
    		try {
    			ic = new IndexCreate(indexDir);
    			long start = System.currentTimeMillis();
    			int num = ic.index(dataDir);
    			long end = System.currentTimeMillis();
    			System.out.println("The index now holds " + num + " documents; indexing took " + (end - start) + " ms");
    			
    			
    		} catch (Exception e) {
    			e.printStackTrace();
    		} finally {
    			try {
    				// ic is still null if the constructor threw
    				if (ic != null) {
    					ic.closeIndexWriter();
    				}
    			} catch (Exception e) {
    				e.printStackTrace();
    			}
    		}
    	}
    	
    	/**
    	 * Term query: looks up one exact term in one field.
    	 */
    	@Test
    	public void testTermQuery() {
    		String indexDir = "E:\\temp\\test\\lucene\\demo4";
    		
    		String fld = "contents";
    		// A TermQuery is not analyzed, so the text must match the indexed (lowercased) term exactly
    		String text = "indexformattoooldexception";
    		// Field name and keyword of the term
    		Term t  = new Term(fld , text);
    		TermQuery tq = new TermQuery(t  );
    		try {
    			FSDirectory indexDirectory = FSDirectory.open(Paths.get(indexDir));
    			// Note: an index reader is not created with new; it is opened through DirectoryReader
    			IndexReader indexReader = DirectoryReader.open(indexDirectory);
    			// Create the searcher on top of the reader
    			IndexSearcher is = new IndexSearcher(indexReader);
    			
    			
    			TopDocs hits = is.search(tq, 100);
    //			System.out.println(hits.totalHits);
    			for(ScoreDoc scoreDoc: hits.scoreDocs) {
    				Document doc = is.doc(scoreDoc.doc);
    				System.out.println("File "+doc.get("fullPath")+" contains the keyword");
    				
    			}
    			indexReader.close();
    		} catch (IOException e) {
    			e.printStackTrace();
    		}
    	}
    	
    	/**
    	 * Query expression (QueryParser): the query text is run through the analyzer first.
    	 */
    	@Test
    	public void testQueryParser() {
    		String indexDir = "E:\\temp\\test\\lucene\\demo4";
    		// Create the query parser (which analyzer parses queries against which default field)
    		QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer());
    		try {
    			FSDirectory indexDirectory = FSDirectory.open(Paths.get(indexDir));
    			// Note: an index reader is not created with new; it is opened through DirectoryReader
    			IndexReader indexReader = DirectoryReader.open(indexDirectory);
    			// Create the searcher on top of the reader
    			IndexSearcher is = new IndexSearcher(indexReader);
    			
    			// Let the parser analyze the keyword and build the query
    			TopDocs hits = is.search(queryParser.parse("indexformattoooldexception") , 100);
    			for(ScoreDoc scoreDoc: hits.scoreDocs) {
    				Document doc = is.doc(scoreDoc.doc);
    				System.out.println("File "+doc.get("fullPath")+" contains the keyword");
    				
    			}
    			indexReader.close();
    		} catch (IOException e) {
    			e.printStackTrace();
    		} catch (ParseException e) {
    			e.printStackTrace();
    		}
    	}
    	
    }
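
    The two tests above search for the same keyword, but they treat the query text differently: a term query compares the text as-is with the indexed terms, whereas a query parser runs it through the analyzer first. The plain-Java sketch below imitates this with a naive lowercase-and-split analyzer. `AnalysisSketch` and its methods are hypothetical, and the sample sentence is made up for the example.

    ```java
    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Locale;
    import java.util.Set;

    // Hypothetical toy class, not part of Lucene.
    class AnalysisSketch {
        /** Naive analyzer: lowercase, then split on anything that is not a letter or digit. */
        static List<String> analyze(String text) {
            List<String> tokens = new ArrayList<>();
            for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!token.isEmpty()) tokens.add(token);
            }
            return tokens;
        }

        /** The terms that end up in the index for a piece of text. */
        static Set<String> indexTerms(String text) {
            return new HashSet<>(analyze(text));
        }

        /** Term-query style lookup: the raw text is compared with the indexed terms unchanged. */
        static boolean termLookup(Set<String> terms, String rawTerm) {
            return terms.contains(rawTerm);
        }

        /** Parser style lookup: the query text is analyzed first, then every token must be present. */
        static boolean parsedLookup(Set<String> terms, String queryText) {
            List<String> tokens = analyze(queryText);
            return !tokens.isEmpty() && terms.containsAll(tokens);
        }

        public static void main(String[] args) {
            Set<String> terms = indexTerms("IndexFormatTooOldException is thrown for old indexes");
            System.out.println(termLookup(terms, "IndexFormatTooOldException"));   // prints false
            System.out.println(termLookup(terms, "indexformattoooldexception"));   // prints true
            System.out.println(parsedLookup(terms, "IndexFormatTooOldException")); // prints true
        }
    }
    ```

    This is why the term-query test spells the keyword in lowercase: the indexed terms were lowercased at index time, and a term query does not repeat that step.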
    
    

    Numeric range queries, prefix queries (PrefixQuery), and combining them with BooleanQuery:

    package com.zking.test.lucene;
    
    import java.nio.file.Paths;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.IntField;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.NumericRangeQuery;
    import org.apache.lucene.search.PrefixQuery;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Before;
    import org.junit.Test;
    
    /**
     * Numeric range queries
     * Prefix queries (PrefixQuery)
     * @author Administrator
     *
     */
    public class Demo6 {
    	private int ids[]={1,2,3};
    	private String citys[]={"qingdao","nanjing","shanghai"};
    	private String descs[]={
    			"Qingdao is a beautiful city.",
    			"Nanjing is a city of culture.",
    			"Shanghai is a bustling city."
    	};
    	private FSDirectory dir;
    	
    	/**
    	 * Writes the sample documents before every test.
    	 * Note: this is the same directory Demo3 writes to, and the default open mode appends,
    	 * so clear the directory first if the two demos should not share an index.
    	 * @throws Exception
    	 */
    	@Before
    	public void setUp() throws Exception {
    		dir  = FSDirectory.open(Paths.get("E:\\temp\\test\\lucene\\demo2\\indexDir"));
    		IndexWriter indexWriter = getIndexWriter();
    		for (int i = 0; i < ids.length; i++) {
    			Document doc = new Document();
    			doc.add(new IntField("id", ids[i], Field.Store.YES));
    			doc.add(new StringField("city", citys[i], Field.Store.YES));
    			doc.add(new TextField("desc", descs[i], Field.Store.NO));
    			indexWriter.addDocument(doc);
    		}
    		indexWriter.close();
    	}
    	
    	/**
    	 * Creates the index writer.
    	 * @return
    	 * @throws Exception
    	 */
    	private IndexWriter getIndexWriter()  throws Exception{
    		Analyzer analyzer = new StandardAnalyzer();
    		IndexWriterConfig conf = new IndexWriterConfig(analyzer);
    		return new IndexWriter(dir, conf );
    	}
    	
    	/**
    	 * Numeric range query: id between 1 and 2, both bounds inclusive.
    	 * @throws Exception
    	 */
    	@Test
    	public void testNumericRangeQuery()throws Exception{
    		IndexReader reader = DirectoryReader.open(dir);
    		IndexSearcher is = new IndexSearcher(reader);
    		
    		NumericRangeQuery<Integer> query=NumericRangeQuery.newIntRange("id", 1, 2, true, true);
    		TopDocs hits=is.search(query, 10);
    		for(ScoreDoc scoreDoc:hits.scoreDocs){
    			Document doc=is.doc(scoreDoc.doc);
    			System.out.println(doc.get("id"));
    			System.out.println(doc.get("city"));
    			// "desc" was indexed with Field.Store.NO, so this prints null
    			System.out.println(doc.get("desc"));
    		}
    		reader.close();
    	}
    	
    	/**
    	 * Prefix query (PrefixQuery): cities whose name starts with "n".
    	 * @throws Exception
    	 */
    	@Test
    	public void testPrefixQuery()throws Exception{
    		IndexReader reader = DirectoryReader.open(dir);
    		IndexSearcher is = new IndexSearcher(reader);
    		
    		PrefixQuery query=new PrefixQuery(new Term("city","n"));
    		TopDocs hits=is.search(query, 10);
    		for(ScoreDoc scoreDoc:hits.scoreDocs){
    			Document doc=is.doc(scoreDoc.doc);
    			System.out.println(doc.get("id"));
    			System.out.println(doc.get("city"));
    			System.out.println(doc.get("desc"));
    		}
    		reader.close();
    	}
    	
    	/**
    	 * Boolean query: both clauses are MUST, so a document has to satisfy the range AND the prefix.
    	 * With the sample data no city starting with "s" has an id in [1, 2], so nothing is printed.
    	 * @throws Exception
    	 */
    	@Test
    	public void testBooleanQuery()throws Exception{
    		IndexReader reader = DirectoryReader.open(dir);
    		IndexSearcher is = new IndexSearcher(reader);
    		
    		NumericRangeQuery<Integer> query1=NumericRangeQuery.newIntRange("id", 1, 2, true, true);
    		PrefixQuery query2=new PrefixQuery(new Term("city","s"));
    		BooleanQuery.Builder booleanQuery=new BooleanQuery.Builder();
    		booleanQuery.add(query1,BooleanClause.Occur.MUST);
    		booleanQuery.add(query2,BooleanClause.Occur.MUST);
    		TopDocs hits=is.search(booleanQuery.build(), 10);
    		for(ScoreDoc scoreDoc:hits.scoreDocs){
    			Document doc=is.doc(scoreDoc.doc);
    			System.out.println(doc.get("id"));
    			System.out.println(doc.get("city"));
    			System.out.println(doc.get("desc"));
    		}
    		reader.close();
    	}
    }
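
    Combining two clauses with `BooleanClause.Occur.MUST` behaves like a logical AND. The plain-Java sketch below reproduces the three queries of the listing above with `java.util.function.Predicate` over the same sample data, which makes it easy to see what each combination returns. It involves no Lucene at all; `BooleanMustSketch` and `City` are hypothetical names.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.function.Predicate;

    // Hypothetical toy class, not part of Lucene.
    class BooleanMustSketch {
        /** Stand-in for an indexed document with a numeric id and a city name. */
        static final class City {
            final int id;
            final String name;
            City(int id, String name) { this.id = id; this.name = name; }
        }

        /** Same sample data as in the listing above. */
        static List<City> sampleCities() {
            return Arrays.asList(new City(1, "qingdao"), new City(2, "nanjing"), new City(3, "shanghai"));
        }

        /** Returns the names of all cities accepted by the query predicate. */
        static List<String> search(List<City> cities, Predicate<City> query) {
            List<String> names = new ArrayList<>();
            for (City c : cities) {
                if (query.test(c)) names.add(c.name);
            }
            return names;
        }

        public static void main(String[] args) {
            List<City> cities = sampleCities();
            Predicate<City> idFrom1To2 = c -> c.id >= 1 && c.id <= 2; // inclusive range
            Predicate<City> startsWithS = c -> c.name.startsWith("s");
            Predicate<City> startsWithN = c -> c.name.startsWith("n");

            System.out.println(search(cities, idFrom1To2));                  // prints [qingdao, nanjing]
            System.out.println(search(cities, startsWithN));                 // prints [nanjing]
            System.out.println(search(cities, idFrom1To2.and(startsWithS))); // prints []
            System.out.println(search(cities, idFrom1To2.and(startsWithN))); // prints [nanjing]
        }
    }
    ```

    The empty result for the prefix "s" matches the behavior of the boolean query test: shanghai has id 3, which falls outside the range 1 to 2.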
    
    
  • Original article: https://blog.csdn.net/weixin_63719049/article/details/126431175