regex question (Asher?)

  • regex question (Asher?)

    I have a piece of a sed script that is working correctly and I'm not sure why. This bothers me.

    The replacement rule is supposed to change a function of the form:

    Code:
    log(foo,fmt,arg1,arg2)
    into:

    Code:
    log(foo,fmt,(cast)arg1,(cast)arg2)
    This is my rule:
    Code:
    s/\(foo.*,.*\),\(.*\),\(.*\)/\1,(cast)\2,(cast)\3/
                  ^
    If I execute:
    Code:
    echo 'log(foo,"omg,wtf",arg1,arg2)' | sed 's/\(foo.*,.*\),\(.*\),\(.*\)/\1,(cast)\2,(cast)\3/'
                           ^
    I get:
    Code:
    log(foo,"omg,wtf",(cast)arg1,(cast)arg2)
    I don't understand why the marked comma in the rule is matching the marked comma in the input, rather than the preceding comma. Does sed just try to find the longest possible match or something?

  • #2
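    Yes, yes it does. Lazy quantifiers like *? would change that in Perl-style engines, but sed's POSIX regexes don't have them (in a basic regex the ? is just a literal character). To stop a match at the first comma, use a bracket expression like [^,]* instead of .*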
    Graffiti in a public toilet
    Do not require skill or wit
    Among the **** we all are poets
    Among the poets we are ****.
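For what it's worth, here is a rough sketch of the bracket-expression approach applied to the rule from the first post. The rewritten rule is my own guess at an equivalent, not something the OP posted: it assumes the call is the last thing on the line and that the two arguments to cast are the final two. The variable names `rule` and `result` are just for illustration.

```shell
# Hypothetical rewrite of the OP's rule: [^,]* can never cross a comma,
# and )$ pins the match to the last two arguments of the call.
rule='s/,\([^,]*\),\([^,]*\))$/,(cast)\1,(cast)\2)/'

# The /(foo,/ address keeps the substitution limited to lines mentioning foo,
# like the original pattern did.
result=$(printf '%s\n' 'log(foo,"omg,wtf",arg1,arg2)' | sed "/(foo,/$rule")

printf '%s\n' "$result"
# prints: log(foo,"omg,wtf",(cast)arg1,(cast)arg2)
```

Because nothing in this version relies on which comma a greedy .* happens to stop at, the comma inside the format string is simply skipped over.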



    • #3
      I'm perfectly happy with it behaving like this; I just wanted to make sure it would behave this way consistently before I ran the script on ~1MB of source code.



      • #4
        "I don't understand why the marked comma in the rule is matching the marked comma in the input, rather than the preceding comma."

        .* means "any character, zero or more times".

        A comma is a character, so foo.* matches foo,



        • #5
          "I have a piece of a sed script that is working correctly and I'm not sure why. This bothers me."



          To a layman like myself, that was funny.
          Life is not measured by the number of breaths you take, but by the moments that take your breath away.
          "Hating America is something best left to Mobius. He is an expert Yank hater.
          He also hates Texans and Australians, he does diversify." ~ Braindead



          • #6
            Originally posted by VetLegion
            I don't understand why the marked comma in the rule is matching the marked comma in the input, rather than the preceding comma.


            .* means "any character, zero or more times"

            Comma is a character, so foo.* matches foo,
            Has it occurred to you that I couldn't have possibly written the OP without understanding that?

            My question was not 'why does this match' because I know that it matches; my question was 'why did it pick this particular match, given that the answer is ambiguous, and will it work this way consistently'.



            • #7
              What is ambiguous?



              • #8
                Sorry, I just skimmed your question. Onodera is probably right.



                • #9
                  Originally posted by VetLegion
                  What is ambiguous?
                  Not fully specified.

                  e.g. given the string 'foo', '.*' matches '', 'f', 'o', 'fo', 'oo', and 'foo', thus it is ambiguous.
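A quick way to see how sed resolves that ambiguity, as a small sketch (the variable names `whole` and `empty` are only for illustration). The & in the replacement stands for whatever text was actually matched, so wrapping it in brackets shows which candidate sed picked.

```shell
# Of all the substrings of 'foo' that .* could match, sed takes the one that
# starts earliest and, from there, runs longest: the whole string.
whole=$(printf '%s\n' 'foo' | sed 's/.*/[&]/')

# "Earliest" beats "longest": o* can match the empty string at position 0,
# so sed takes that instead of the longer 'oo' further along.
empty=$(printf '%s\n' 'foo' | sed 's/o*/[&]/')

printf '%s\n' "$whole" "$empty"
# prints: [foo]
# prints: []foo
```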



                  • #10
                    Of course, the regexp will misbehave if arg1 or arg2 contains a comma.
                    http://www.hardware-wiki.com - A wiki about computers, with focus on Linux support.
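To make that failure mode concrete, here is a small sketch that runs the rule from the first post on a made-up call whose third argument is itself a function call containing a comma. The input line and the variable names `rule` and `broken` are invented for illustration.

```shell
# The rule exactly as posted in the first message.
rule='s/\(foo.*,.*\),\(.*\),\(.*\)/\1,(cast)\2,(cast)\3/'

# Hypothetical input: arg1 is f(a,b), which contains a comma of its own.
broken=$(printf '%s\n' 'log(foo,fmt,f(a,b),arg2)' | sed "$rule")

printf '%s\n' "$broken"
# prints: log(foo,fmt,f(a,(cast)b),(cast)arg2)
```

The first group greedily swallows everything up to the comma inside f(a,b), so the cast lands in the middle of the nested call rather than in front of it.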



                    • #11
                      Yes. Thankfully, I'm pretty sure that's never the case.



                      • #12
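                        By default, if there are multiple matches possible for a regex, a POSIX engine like sed's picks the one that starts leftmost and, among those, the longest. Within that match, each group from left to right also grabs as much as it can.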
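As a rough check of that behaviour on the input from the first post, the sketch below keeps the original pattern but swaps the replacement for one that just brackets each group, so you can see what each group captured. The variable name `captured` is only for illustration.

```shell
# Same pattern as the OP's rule; only the replacement differs, so that each
# captured group is printed inside its own pair of brackets.
captured=$(printf '%s\n' 'log(foo,"omg,wtf",arg1,arg2)' |
  sed 's/\(foo.*,.*\),\(.*\),\(.*\)/[\1] [\2] [\3]/')

printf '%s\n' "$captured"
# prints: log([foo,"omg,wtf"] [arg1] [arg2)]
```

The first group takes everything it can while still leaving two commas for the rest of the pattern, which is why the marked comma in the rule lines up with the comma after the format string.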



                        • #13
                          It would have been simpler and faster to quote my hypothesis and answer "yes".
