


I've written a custom extension method in C# that improves on the extension method string[] getBetweenAll(string source, string startstring, string endstring).


Originally, this extension method found all substrings between two strings, for example:

string source = "<1><2><3><4>";
source.getBetweenAll("<", ">");
//output: string[] {"1", "2", "3", "4"}


But if there was an extra occurrence of < at the beginning, it would match from that first < all the way to the last >, returning almost the whole string:

string source = "<<1><2><3><4>";
source.getBetweenAll("<", ">");
//output: string[] {"<1><2><3><4"}


So I rewrote it to be more exact: it searches backwards from each ">" to find the nearest preceding "<".


Now I have it working, but it is far too slow, because the search method steps back through the whole string one character at a time for each occurrence. Do you know how I could improve the speed of this function? Or is that not possible?


Here is the entire code so far: http://pastebin.com/JEZmyfSG. I've added comments where the code needs speed improvement.

// Returns the start index of every occurrence of searchString in main.
public static List<int> IndexOfAll(this string main, string searchString)
{
    List<int> ret = new List<int>();
    int len = searchString.Length;
    int start = -len;
    while (true)
    {
        start = main.IndexOf(searchString, start + len);
        if (start == -1)
            break;
        ret.Add(start);
    }
    return ret;
}

public static string[] getBetweenAll(this string main, string strstart, string strend, bool preserve = false)
{
    List<string> results = new List<string>();
    List<int> ends = main.IndexOfAll(strend);
    foreach (int end in ends)
    {
        int start = main.previousIndexOf(strstart, end);  //This is where it has to search the whole source string every time
        results.Add(main.Substring(start, end - start) + (preserve ? strend : string.Empty));
    }
    return results.ToArray();
}

//This is the slow function (depends on main.Length)
public static int previousIndexOf(this string main, string find, int offset)
{
    int wtf = main.Length;
    int x = main.LastIndexOf(find, wtf);
    while (x > offset)
    {
        x = main.LastIndexOf(find, wtf);
        wtf -= 1;
    }
    return x;
}

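I'm thinking of another way to do this: a PreviousIndexOf(string, int searchfrom) method, which should improve the speed. It would work like IndexOf(), except that it searches backwards and takes a supplied starting offset.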



As with the original getBetweenAll, we can use a regular expression. To match only the shortest "inner" occurrences of the enclosing strings, we use a negative lookahead on the start string and a non-greedy quantifier for the content.

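// Requires: using System.Text.RegularExpressions;
public static string[] getBetweenAll(this string main,
    string strstart, string strend, bool preserve = false)
{
    List<string> results = new List<string>();

    string regularExpressionString = string.Format("{0}(((?!{0}).)+?){1}",
        Regex.Escape(strstart), Regex.Escape(strend));
    Regex regularExpression = new Regex(regularExpressionString, RegexOptions.IgnoreCase);

    var matches = regularExpression.Matches(main);

    foreach (Match match in matches)
    {
        // Assumed meaning of preserve: keep the whole match including the
        // delimiters; otherwise keep only the captured content (group 1).
        if (preserve)
            results.Add(match.Value);
        else
            results.Add(match.Groups[1].Value);
    }

    return results.ToArray();
}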
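
The PreviousIndexOf idea from the question can also be sketched without a regular expression, because String.LastIndexOf already accepts a start index and a count and searches backwards from that position. The following is only a minimal sketch under some assumptions: the names BetweenSketch, PreviousIndexOf (with an extra lowerBound parameter) and GetBetweenAllBackwards are made up for illustration, the comparison is ordinal and case-sensitive (unlike the IgnoreCase regex), the delimiters are never included in the results, and it may not agree with the regex on every edge case (for example, empty content between the delimiters is returned here).

```csharp
using System;
using System.Collections.Generic;

public static class BetweenSketch
{
    // Hypothetical helper: index of the last occurrence of `find` that lies
    // entirely inside [lowerBound, offset), or -1 if there is none.
    public static int PreviousIndexOf(this string main, string find, int offset, int lowerBound = 0)
    {
        int startIndex = offset - 1;              // last character the match may cover
        int count = startIndex - lowerBound + 1;  // size of the backward search window
        if (find.Length == 0 || count < find.Length)
            return -1;
        // LastIndexOf scans backwards from startIndex over `count` characters.
        return main.LastIndexOf(find, startIndex, count, StringComparison.Ordinal);
    }

    // Hypothetical variant of getBetweenAll built on the helper above.
    public static string[] GetBetweenAllBackwards(this string main, string strstart, string strend)
    {
        var results = new List<string>();
        if (strstart.Length == 0 || strend.Length == 0)
            return results.ToArray();

        int lowerBound = 0; // first character after the previous end delimiter
        while (true)
        {
            int end = main.IndexOf(strend, lowerBound, StringComparison.Ordinal);
            if (end == -1)
                break;

            // Look backwards from the end delimiter, but never past the previous one.
            int start = main.PreviousIndexOf(strstart, end, lowerBound);
            if (start != -1)
            {
                int contentStart = start + strstart.Length;
                results.Add(main.Substring(contentStart, end - contentStart));
            }
            lowerBound = end + strend.Length;
        }
        return results.ToArray();
    }
}

// Usage:
// string[] parts = "<<1><2><3><4>".GetBetweenAllBackwards("<", ">");
// parts is {"1", "2", "3", "4"}
```

Because each backward search is limited to the characters between the previous end delimiter and the current one, no part of the string is scanned backwards more than once, which should avoid rescanning the whole source for every match.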

