Regex in C#

שרה רחל

I'm trying to write code that takes a string of characters and splits it into words, and if a word is longer than 10 characters, splits it into pieces of 10 characters each.
For example, the input
hello now is ten av abcdefghijklmnopqrstuvwxyz
should produce:

hello
now
is
ten
av
abcdefghij
klmnopqrst
uvwxyz
What is wrong here:

string[] words = Regex.Split(sWord, @"\W+|/\w{10}");
foreach (var word in words)
{
    // Print the current word, not the whole array
    Console.WriteLine("WORD: " + word);
}

dovid

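@שרה-רחל First, your pattern has a stray forward slash before \w{10}. Inside a verbatim string (@"...") the backslashes already reach the regex engine as-is, so /\w{10} means a literal forward slash followed by 10 word characters, which never occurs in your input.
If we fix that by removing the slash (giving @"\W+|\w{10}"), we get a result close to what you want: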

hello 
now 
is 
ten 
av 
  
  
uvwxyz
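
The listing above can be reproduced with a short sketch, assuming the fix amounts to dropping the forward slash from the pattern. The variable names here are illustrative and not from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

string input = "hello now is ten av abcdefghijklmnopqrstuvwxyz";

// Pattern from the question: the second alternative needs a literal '/',
// which never occurs in the input, so only \W+ (the spaces) ever matches.
string[] original = Regex.Split(input, @"\W+|/\w{10}");

// Same pattern without the slash: every run of 10 word characters is now
// treated as a separator too, so it is consumed rather than returned.
string[] corrected = Regex.Split(input, @"\W+|\w{10}");

Console.WriteLine(original.Length);   // 6, the long word stays in one piece
Console.WriteLine(corrected.Length);  // 8, two of the pieces are empty strings

// Brackets make the empty pieces visible
foreach (string piece in corrected)
    Console.WriteLine("[" + piece + "]");
```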

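The text was split in the right places, but two pieces were dropped.
Why? Because Regex.Split does not include the separator itself in the results, and here the separator is the entire run of 10 letters...
To include the separator you can wrap it in parentheses, making it a capturing group (what magic!), but I tried that in our case and for some reason the result is this: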

hello 
now 
is 
ten 
av 
  
abcdefghij 
  
klmnopqrst 
uvwxyz
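
The blank entries in this listing appear to be the empty strings that Regex.Split emits between two separators that sit back to back (the space followed by the first 10-letter chunk, and then the two chunks next to each other). Two possible ways around that are sketched below, as an illustration rather than the definitive fix: filter the empty strings out, or match the words directly with Regex.Matches instead of splitting. The names viaSplit and viaMatches are made up for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "hello now is ten av abcdefghijklmnopqrstuvwxyz";

// Option 1: keep the capturing group so the 10-character chunks are returned,
// then drop the empty strings Split emits between back-to-back separators.
string[] viaSplit = Regex.Split(input, @"\W+|(\w{10})")
    .Where(piece => piece.Length > 0)
    .ToArray();

// Option 2: match the words instead of splitting on separators.
// \w{1,10} is greedy, so it takes up to 10 word characters at a time.
string[] viaMatches = Regex.Matches(input, @"\w{1,10}")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(",", viaSplit));
Console.WriteLine(string.Join(",", viaMatches));
// Both lines print: hello,now,is,ten,av,abcdefghij,klmnopqrst,uvwxyz
```

Option 2 avoids the empty-string question entirely, since nothing is treated as a separator.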

I would implement this without regex, using a dedicated function, for example:

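// Splits input into words; words longer than len are cut into chunks of len characters.
List<string> SplitByWordsAndLength(string input, int len = 10)
{
    var list = new List<string>();
    var pos = -1; // start index of the current word, or -1 when between words

    for (var counter = 0; counter < input.Length; counter++)
    {
        if (char.IsSeparator(input[counter]))
        {
            // A separator ends the current word, if there is one
            if (pos != -1)
                list.Add(input.Substring(pos, counter - pos));
            pos = -1;
        }
        else
        {
            if (pos == -1) pos = counter; // first character of a new word
            else if ((counter - pos) == len)
            {
                // The word has reached len characters: emit the chunk and start the next one here
                list.Add(input.Substring(pos, counter - pos));
                pos = counter;
            }
        }
    }

    // Emit whatever is left after the last separator or chunk boundary
    if (pos != -1)
        list.Add(input.Substring(pos));

    return list;
}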

And here it is in a more modern flavor, using an iterator (yield return) and range syntax:

IEnumerable<string> SplitByWordsAndLength(string input, int len = 10)
{
    var pos = -1;
    for (var counter = 0; counter < input.Length; counter++)
        if (char.IsSeparator(input[counter]))
        {
            if (pos != -1)
                yield return input[pos..counter];
            pos = -1;
        }
        else
        {
            if (pos == -1) pos = counter;
            else if ((counter - pos) == len)
            {
                yield return input[pos..counter];
                pos = counter;
            }
        }


    if (pos != -1)
        yield return input[pos..];
}
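
One thing to keep in mind with the functions above: char.IsSeparator is true only for Unicode space, line, and paragraph separators, so tabs, newlines, and punctuation are not treated as word boundaries the way \W treats them. The sketch below is a hypothetical variant (SplitWords and its isSeparator parameter are not from the thread) that takes the separator test as an argument so the difference can be seen side by side:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of the function above: the separator test is a parameter.
static IEnumerable<string> SplitWords(string input, Func<char, bool> isSeparator, int len = 10)
{
    var pos = -1; // start of the current word, or -1 when between words
    for (var i = 0; i < input.Length; i++)
    {
        if (isSeparator(input[i]))
        {
            if (pos != -1) yield return input[pos..i];
            pos = -1;
        }
        else if (pos == -1)
        {
            pos = i;
        }
        else if (i - pos == len)
        {
            yield return input[pos..i];
            pos = i;
        }
    }
    if (pos != -1) yield return input[pos..];
}

string sample = "hello\tnow abcdefghijklmnopqrstuvwxyz";

// A tab is a control character, not a Unicode separator,
// so "hello\tnow" is kept together as one 9-character word.
var withIsSeparator = new List<string>(SplitWords(sample, char.IsSeparator));
Console.WriteLine(withIsSeparator.Count); // 4

// char.IsWhiteSpace also treats tabs and newlines as boundaries.
var withIsWhiteSpace = new List<string>(SplitWords(sample, char.IsWhiteSpace));
Console.WriteLine(withIsWhiteSpace.Count); // 5
```

Which test is appropriate depends on the input; for text that may contain punctuation, a predicate such as c => !char.IsLetterOrDigit(c) would be closer to, though not identical to, the regex \W.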

Tchumim (תחומים) - professional Haredi forum
