目录
字符串(String)是用一对双引号括起来的零个或多个字符组成的有限序列,在java中,字符串是被作为对象来处理的。常用的字符串可以分为以下两大类:
毫无疑问,字符串是计算机处理信息的基础,毕竟我们向计算机输入的信息本质上都是字符串,而计算机所谓的“打印”实际上也是以字符串的形式显示出来的。所以,要想学好计算机,学会并熟练使用字符串以及字符串的各种操作(比如拼接、比较、截取、查找和替换等)是必不可少的。下面,我们就来进行一些基础的字符串操作,话不多说,上代码!
其实字符串也是一种线性表,它和之前的栈、队列一样,有顺序存储结构、链式存储结构这两种存储方式,今天我们采用的是顺序存储结构,即以顺序表(数组)为底层进行模拟。
相信经过这几天的学习,大家对类的创建已经差不多熟练了,这里就不过多阐述了。
- public class MyString {
- /**
- * The maximal length.
- */
- public static final int MAX_LENGTH = 10;
-
- /**
- * The actual length.
- */
- int length;
-
- /**
- * The data.
- */
- char[] data;
-
- /**
- *********************
- * Construct an empty char array.
- *********************
- */
- public MyString() {
- length = 0;
- data = new char[MAX_LENGTH];
- }// Of the first constructor
-
- /**
- *********************
- * Construct using a system defined string.
- *
- * @param paraString
- * The given string. Its length should not exceed MAX_LENGTH - 1.
- *********************
- */
- public MyString(String paraString) {
- data = new char[MAX_LENGTH];
- length = paraString.length();
-
- // Copy data.
- for (int i = 0; i < length; i++) {
- data[i] = paraString.charAt(i);
- } // Of for i
- } // Of the second constructor
注意,为了后续操作方便,这里我们使用了charAt()方法+for循环,从而将字符串paraString复制到之前定义的数组data中。
- /**
- *********************
- * Overrides the method claimed in Object, the superclass of any class.
- *********************
- */
- public String toString() {
- String resultString = "";
-
- for (int i = 0; i < length; i++) {
- resultString += data[i];
- } // Of for i
-
- return resultString;
- } // Of toString
同样的,我们重写toString()方法来实现字符串遍历。
进行代码模拟之前,我们先来了解一下什么是字符串匹配。所谓“字符串匹配”,就是给定一个模式字符串(相当于原字符串的子串),判断其是否存在于主字符串(即原字符串)中,如果存在,确定该模式串出现的位置。显然,模式串的长度(patternLength) ≤ 主串的长度(mainLength)。
具体操作的方法为:将模式串的元素逐个与主串一一比对,直到在主串中找到某一段连续的部分与模式串完全匹配,此时模式串中第一个元素的下标即为该模式串出现的位置。
为了更好地理解字符串匹配的过程,我们用图来说明:
每次匹配,都是将模式串中的元素逐个(即从头到尾)与主串中的对应元素进行比对,若全部相同则匹配成功;若匹配过程中出现模式串的某个元素与对应元素不一样则立即停止,直接移动模式串进行下一次匹配。此外,由上图可以看出,最多需要进行mainLength - patternLength + 1次匹配。
理解字符串匹配的原理后,我们就可以开始代码实现了,如下:
- /**
- *********************
- * Locate the position of a substring.
- *
- * @param paraString The given substring.
- * @return The first position. -1 for no matching.
- *********************
- */
- public int locate(MyString paraMyString) {
- boolean tempMatch = false;
- for (int i = 0; i < length - paraMyString.length + 1; i++) {
- // Initialize.
- tempMatch = true;
- for (int j = 0; j < paraMyString.length; j++) {
- if (data[i + j] != paraMyString.data[j]) {
- tempMatch = false;
- break;
- } // Of if
- } // Of for j
-
- if (tempMatch) {
- return i;
- } // Of if
- } // Of for i
- return -1;
- } // Of locate
这里我们用到的是两层for循环,外层for-i循环是控制匹配次数的循环,因为最多匹配mainLength - patternLength + 1次且i的初始值为0,所以循环控制条件即为i < mainLength - patternLength + 1;而内层for-j循环则控制每次匹配时对模式串元素从头到尾进行比对,相当于对模式串进行一次遍历操作。注意,每次匹配时一旦出现比对失败就立即停止,马上进行下一次匹配,所以我们这里增加了一个if判断语句+break语句。
如果某次匹配全部比对成功,则执行完for-j循环后tempMatch仍为true,所以直接返回i,因为此时的i恰好就是模式串第一个元素当前所处的位置。
所谓“字符串截取”,顾名思义,就是从原字符串中截取某一部分的内容,有以下两种方法:
通常我们用的是第一种方法,代码如下:
- /**
- *********************
- * Get a substring
- *
- * @param paraString The given substring.
- * @param paraStartPosition The start position in the original string.
- * @param paraLength The length of the new string.
- * @return The first position. -1 for no matching.
- *********************
- */
- public MyString substring(int paraStartPosition, int paraLength) {
- if (paraStartPosition + paraLength > length) {
- System.out.println("The bound is exceeded.");
- return null;
- } // Of if
-
- MyString resultMyString = new MyString();
- resultMyString.length = paraLength;
- for (int i = 0; i < paraLength; i++) {
- resultMyString.data[i] = data[paraStartPosition + i];
- } // Of for i
-
- return resultMyString;
- } // Of substring
当然,在进行字符串截取之前,还需要提前判断我们想要截取的内容是否超出了原字符串的范围。这里我们通过if语句进行判断,当paraStartPosition + paraLength > length(即截取内容已超出原字符串的范围)时,输出" The bound is exceeded.",并返回null。如果没有超出范围,则从截取内容的开头下标开始遍历,将内容copy到结果字符串中。
接着,我们进行如下的数据测试(注意,空格、标点符号等也算一个字符):
- /**
- *********************
- * The entrance of the program.
- *
- * @param args Not used now.
- *********************
- */
- public static void main(String args[]) {
- MyString tempFirstString = new MyString("I like ik.");
- MyString tempSecondString = new MyString("ik");
- int tempPosition = tempFirstString.locate(tempSecondString);
- System.out.println("The position of \"" + tempSecondString + "\" in \"" + tempFirstString
- + "\" is: " + tempPosition);
-
- MyString tempThirdString = new MyString("ki");
- tempPosition = tempFirstString.locate(tempThirdString);
- System.out.println("The position of \"" + tempThirdString + "\" in \"" + tempFirstString
- + "\" is: " + tempPosition);
-
- tempThirdString = tempFirstString.substring(1, 2);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
-
- tempThirdString = tempFirstString.substring(5, 5);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
-
- tempThirdString = tempFirstString.substring(5, 6);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
- } // Of main
- package datastructure;
-
- /**
- * My string. String is a class provided by the language, so I use another name.
- * It is essentially a sequential list with char type elements.
- *
- *@auther Xin Lin 3101540094@qq.com.
- */
-
- public class MyString {
- /**
- * The maximal length.
- */
- public static final int MAX_LENGTH = 10;
-
- /**
- * The actual length.
- */
- int length;
-
- /**
- * The data.
- */
- char[] data;
-
- /**
- *********************
- * Construct an empty char array.
- *********************
- */
- public MyString() {
- length = 0;
- data = new char[MAX_LENGTH];
- }// Of the first constructor
-
- /**
- *********************
- * Construct using a system defined string.
- *
- * @param paraString
- * The given string. Its length should not exceed MAX_LENGTH - 1.
- *********************
- */
- public MyString(String paraString) {
- data = new char[MAX_LENGTH];
- length = paraString.length();
-
- // Copy data.
- for (int i = 0; i < length; i++) {
- data[i] = paraString.charAt(i);
- } // Of for i
- } // Of the second constructor
-
- /**
- *********************
- * Overrides the method claimed in Object, the superclass of any class.
- *********************
- */
- public String toString() {
- String resultString = "";
-
- for (int i = 0; i < length; i++) {
- resultString += data[i];
- } // Of for i
-
- return resultString;
- } // Of toString
-
- /**
- *********************
- * Locate the position of a substring.
- *
- * @param paraString The given substring.
- * @return The first position. -1 for no matching.
- *********************
- */
- public int locate(MyString paraMyString) {
- boolean tempMatch = false;
- for (int i = 0; i < length - paraMyString.length + 1; i++) {
- // Initialize.
- tempMatch = true;
- for (int j = 0; j < paraMyString.length; j++) {
- if (data[i + j] != paraMyString.data[j]) {
- tempMatch = false;
- break;
- } // Of if
- } // Of for j
-
- if (tempMatch) {
- return i;
- } // Of if
- } // Of for i
- return -1;
- } // Of locate
-
- /**
- *********************
- * Get a substring
- *
- * @param paraString The given substring.
- * @param paraStartPosition The start position in the original string.
- * @param paraLength The length of the new string.
- * @return The first position. -1 for no matching.
- *********************
- */
- public MyString substring(int paraStartPosition, int paraLength) {
- if (paraStartPosition + paraLength > length) {
- System.out.println("The bound is exceeded.");
- return null;
- } // Of if
-
- MyString resultMyString = new MyString();
- resultMyString.length = paraLength;
- for (int i = 0; i < paraLength; i++) {
- resultMyString.data[i] = data[paraStartPosition + i];
- } // Of for i
-
- return resultMyString;
- } // Of substring
-
- /**
- *********************
- * The entrance of the program.
- *
- * @param args Not used now.
- *********************
- */
- public static void main(String args[]) {
- MyString tempFirstString = new MyString("I like ik.");
- MyString tempSecondString = new MyString("ik");
- int tempPosition = tempFirstString.locate(tempSecondString);
- System.out.println("The position of \"" + tempSecondString + "\" in \"" + tempFirstString
- + "\" is: " + tempPosition);
-
- MyString tempThirdString = new MyString("ki");
- tempPosition = tempFirstString.locate(tempThirdString);
- System.out.println("The position of \"" + tempThirdString + "\" in \"" + tempFirstString
- + "\" is: " + tempPosition);
-
- tempThirdString = tempFirstString.substring(1, 2);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
-
- tempThirdString = tempFirstString.substring(5, 5);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
-
- tempThirdString = tempFirstString.substring(5, 6);
- System.out.println("The substring is: \"" + tempThirdString + "\"");
- } // Of main
- } // Of class MyString
运行结果
显然,运行结果与预测结果保持一致。
总的来说,字符串不仅是计算机处理信息的基础,还涉及到很多应用程序的核心功能,所以掌握字符串及其相关操作对编程学习是非常重要的。此外,对于今天的字符串匹配操作,我们采用的是BF暴力算法,但其实还有KMP算法、Rabin-Karp算法、Boyer-Moore算法等。