ntb-scripts · The NoteTab Scripts Group

Group Information

  • Members: 211
  • Category: Software
  • Founded: Aug 29, 2002
  • Language: English

Messages

#562 From: "John Shotsky" <jshotsky@...>
Date: Thu May 17, 2012 1:04 am
Subject: Assertion behavior - was - RE: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?
shotsky1
 
Flo (and all)
Attempting to use the negative assertion function discussed below, but with
bewildering results.
I have two versions of a clip. One works as expected, the other doesn't. I can't
see why the difference.
Here are the two versions. To run them, just comment one out and run the other.
;^!Replace "^(Notes)(?!::)" >> "Dir::$1" ARSW
^!Replace "^Notes(?!::)" >> "Dir::" ARSW

The goal is to place a Dir:: tag in front of any Notes tag that is not followed
by [::].
Here is the text sample:
Notes::Note1: The traditional way to prepare this dish does not call for the
removal of the vein from the prawn.

It appears to work in v6.2, but it was a bit erratic - there were times when it
didn't work, but it seems to be working
correctly now. I will be retesting to see if I can get it to fail again. But in
version 7, the first clip always fails
the negative assertion and thus places the Dir:: tag when it shouldn't.

Can anyone see anything I'm doing wrong, or is this a regex bug?

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


-----Original Message-----
From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On Behalf
Of John Shotsky
Sent: Thursday, May 10, 2012 09:13
To: ntb-scripts@yahoogroups.com
Subject: RE: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?

Flo,
Thank you. Always nice to learn something new. I will play around with this
until I have it fully internalized. I have
needed this function quite a few times and have 'tokenized' and then used a
character class instead. (And then
untokenized.) This is obviously a better way to do it.

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/

From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On Behalf
Of flo.gehrke
Sent: Thursday, May 10, 2012 07:46
To: ntb-scripts@yahoogroups.com
Subject: Re: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?


--- In ntb-scripts@yahoogroups.com, "John Shotsky" <jshotsky@...> wrote:
>
> I am not understanding something here – The criteria was:
> three characters that are *NOT* MON|TUE|WED|THU|FRI|SAT|SUN
>
> How is this avoiding those strings? On multiple occasions I've wanted
> to match text that didn't contain a certain string.
>

John,

The second part of that RegEx...

\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)

matches an opening and a closing literal parenthesis '(...)' enclosing three
characters '.{3}' that are NOT 'Mon', 'Tue' etc., as
Joy demanded.

The three-letter day names are excluded with a Negative Lookahead. Since a
Lookahead doesn't consume any characters, any other
three-character string will match at the same position between the opening and
the closing parenthesis. That's why, for example...

'John' is matched with '(?!Mary)John'

that is: Find 'John' at a position where you don't see 'Mary' when looking
ahead.
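
A minimal sketch of the lookahead described above, using Python's `re` module rather than NoteTab's own engine (an assumption: both follow Perl-style syntax for this construct). The sample strings are made up for illustration.

```python
import re

# Three characters in parentheses that are NOT a day abbreviation.
pattern = re.compile(r"\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)")

samples = ["(Mon)", "(abc)", "(123)", "(Sun)"]  # hypothetical test data
matched = [s for s in samples if pattern.search(s)]
print(matched)  # ['(abc)', '(123)']

# The lookahead consumes nothing, so 'John' is matched at the very
# position where 'Mary' was ruled out.
john = re.search(r"(?!Mary)John", "John").group()
print(john)  # John
```

The lookahead only vetoes a position; the `.{3}` that follows still has to match the three characters itself.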

Regards,
Flo



[Non-text portions of this message have been removed]



------------------------------------

Fookes Software: http://www.fookes.com/
NoteTab website: http://www.notetab.com/
NoteTab Discussion Lists: http://www.notetab.com/groups.php

***
Yahoo! Groups Links

#563 From: "John Shotsky" <jshotsky@...>
Date: Thu May 17, 2012 2:38 am
Subject: RE: [Clip] Assertion behavior - was - RE: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?
shotsky1
 
Further testing shows that I didn't have the clips exactly the same in 6.2 and
7. In one case there is an [s?] on the end
of Notes, and in the other case there isn't. It is when the question mark is
present that the failure occurs in both 6.2
and 7. But still, I would have expected it to work, especially because the 's'
is present - so that condition should
have been satisfied before the lookahead started looking. So, the question for
me, then, is how do I prevent this
erroneous action in the lookahead, since the s? term must be present?

These are the two clips as they produce the error:
;^!Replace "^(Notes?)(?!::)" >> "Dir::$1" ARSW
^!Replace "^Notes?(?!::)" >> "Dir::" ARSW
Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/
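
One likely explanation (an interpretation, not something confirmed in the thread) is backtracking: when the lookahead fails after `Notes`, the engine retries with `s?` matching nothing, and at that position the next characters are `s::` rather than `::`, so the negative assertion succeeds. The sketch below reproduces this with Python's `re` module; the two alternative patterns are suggestions that have not been tested in NoteTab.

```python
import re

line = "Notes::Note1: The traditional way to prepare this dish"

# With the optional 's', the engine backtracks to 'Note'; what follows
# is 's::', which is not '::', so the negative lookahead succeeds.
m = re.search(r"^Notes?(?!::)", line)
print(m.group())  # Note

# Option 1: let the lookahead account for the optional 's' as well.
fixed_a = re.compile(r"^Notes?(?!s?::)")
# Option 2: require a word boundary so the match cannot stop before 's'.
fixed_b = re.compile(r"^Notes?\b(?!::)")

for rx in (fixed_a, fixed_b):
    print(rx.search(line))                    # None
    print(rx.search("Notes: plain").group())  # Notes
```

Either variant keeps the `s?` term while stopping the engine from satisfying the assertion by giving the `s` back.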



#564 From: "flo.gehrke" <flo.gehrke@...>
Date: Thu May 17, 2012 7:17 pm
Subject: Re: [NTS] Redirecting GAWK Output
flo.gehrke
 
--- In ntb-scripts@yahoogroups.com, Robert Bull <barlennan@...> wrote:
>
> Kees' script does the job (...) I have a section titled Scripts in
> my clipbook, beneath the section of ordinary clips.  In the Scripts
> section I have a clip entitled "Kees AWK script".  It contains my
> version of Kees' script:
>
> Kees AWK script
> ----------------
> BEGIN { max = 0 + 0 }
> # echo input to original file
> {
>         print
> }
> # the rest of the script
> {
> if(length($0) > max){
>         max = length($0)
> }
> }
> END { printf("max line length is %d", max) }
> ----------------
> Notice there aren't any references to files or file names at all.

Thanks, Robert!

I see that last version working and getting to the result that you had in mind.

Actually, I never talked of inserting the result into the active document. The
goal was...

- To have that list (source text) open

- to run a clip which calls GAWK and starts a GAWK script which calculates the
length of the longest line

- to redirect the result from GAWK back into the NT clip

- to continue clip execution with that data, i.e., in this case, just the value
'16'.
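
For readers without GAWK at hand, the calculation the script performs can be sketched in Python. This only mirrors the "length of the longest line" step; it does not address the redirection into the clip, and the sample text is hypothetical.

```python
def longest_line_length(text: str) -> int:
    """Return the length of the longest line (0 for empty input)."""
    return max((len(line) for line in text.splitlines()), default=0)

sample = "short\na somewhat longer line\nmid length"  # hypothetical data
print(longest_line_length(sample))  # 22
```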

In this context, I mentioned the clipboard only as a target for redirecting the
GAWK output. The idea was, to continue clip execution, for example, with
displaying the result in an infobox that could call the clipboard with...

^!Info The longest line has ^$GetClipboard$ characters

Meanwhile, we have seen ways to redirect the GAWK output to the clipboard
(with MS Clip.exe), and ways to feed GAWK its input from the clipboard
(with gclip.exe from UnxUtils). This could be of interest in cases where we send
data from NT to the clipboard, process them with GAWK, and redirect the result
via the clipboard back into a clip execution.

So far, we haven't seen a solution for this full round trip (to clipboard/from
clipboard). That's why I brought up my message #557 again.

If you still have the patience to spend some time with that subject -- please
have a look at the latest messages. It's all in the NT Scripts Group on Yahoo --
or in your incoming emails ;-)

Regards,
Flo

#565 From: Robert Bull <barlennan@...>
Date: Thu May 17, 2012 8:34 pm
Subject: Re: [NTS] Redirecting GAWK Output
barlennan
 
Hello, flo.gehrke;

Thursday, May 17, 2012, 8:17:01 PM, you wrote:

fg> I see that last version working and getting to the result that you
fg> had in mind.

What a pity it wasn't what *you* had in mind. I might have known I'd
get the wrong end of the stick... still, it was a fun play session.


--
Regards,

Robert Bull
   mailto:barlennan@...

#566 From: Kees Nuyt <k.nuyt@...>
Date: Thu May 17, 2012 8:51 pm
Subject: Re: [NTS] Re: Redirecting GAWK Output
knuyt
 
On Mon, 14 May 2012 16:17:09 -0000, you wrote:

>   Why don't I see the lines output by PRINT?
>
>
> GAWK Subroutine called in NT clip:
>  This leaves all the original document lines followed by 16 on the last line.
>  All text is selected by clip.
> # Find size of longest line
> {
>  print $0
>  if (length($0) > max) { max = length($0) }
> }
> END { print max }
>
>
> SCRIPT.AWK
> ----------
> # Find size of longest line
> {
>  PRINT $0
>  if (length($0) > max) { max = length($0) }
> }
> END { print max }

PRINT is not the same as print .
gawk is case-sensitive.


--
Regards,

Kees Nuyt

#567 From: "mycroftj" <mycroftj@...>
Date: Sun May 27, 2012 11:07 pm
Subject: Why no output from GAWK END clause and no redirect file created?
mycroftj
 
I have been at this all day and am still stuck.
I want to have a clip run GAWK code from an external file against an external
data file and send all the output to a new external data file.

I cannot get an output data file to be created and the output from the END
clause is never captured when I obtain the program output into a NT document.

I've tried GetInputOutput, GetOutput, GetDosOutput, Run and Shell in various
forms.

Following is a clip that creates a small data file and a small code file and
tries to capture the output to a new file. I've put a sample of the
GetInputOutput and Run commands in. Comment them out one by one, etc.

If anyone can get it to work, I would LOVE to know what I am doing wrong.

The output should be
Begin Clause
Bob
End Clause

The small files are written to your temp folder and will be deleted when you run
any clean up routines on your PC. If you don't ever do that, two additional tiny
files are not going to make any difference!

Thanks so much.

Joy


^!Continue This will create a small code and data file in your temp folder. ^%NL% I'm trying to find a way to execute the code in the external file using the external data and routing all output to another external file. ^%NL%^%NL% It works when executing the Gawk.exe\ ... > output_file commands in DOS mode.^%NL%^%NL% Run this from an empty document.

;^!Setdebug ON

; Create data file
^!TextToFile "^$ExpandEnv(%TEMP%)$\Input_Data.txt" Mike^%NL%Bob^%NL%Mary^%NL%Sue

; create GAWK script code file
^!TextToFile "^$ExpandEnv(%TEMP%)$\GAWK_Code.txt" BEGIN { print "Begin Clause" }^%NL%/Bob/ { print $0 }^%NL%END { printf("End Clause\n") }

; This runs but creates no output or output file
;^!Run ^$GetGawkExe$ -f ^$ExpandEnv(%TEMP%)$\GAWK_Code.txt ^$ExpandEnv(%TEMP%)$\Input_Data.txt > ^$ExpandEnv(%TEMP%)$\output.txt

; This puts
;Begin Clause
;Bob
;in my current document.  Where is the END statement output?
; Why no output file created?
;^$GetInputOutput(^$GetGawkExe$ -f ^$ExpandEnv(%TEMP%)$\GAWK_Code.txt ^$ExpandEnv(%TEMP%)$\Input_Data.txt > ^$ExpandEnv(%TEMP%)$\output.txt)$

^!prompt Done

#568 From: "mycroftj" <mycroftj@...>
Date: Mon May 28, 2012 6:28 pm
Subject: Re: Why no output from GAWK END clause and no redirect file created?
mycroftj
 
Found an answer. An idea came to me this morning. After playing with all the
fancy DOS NT commands, just plain ^!"^%GAWK_Executable%" -f ... worked. I'm sure
somebody would have told me this eventually.

Joy




#569 From: Art Kocsis <artkns@...>
Date: Sun Sep 23, 2012 9:38 am
Subject: Finding Pairwise Matches
artkns
 
I am past pulling hair and am now down to scalp and it's getting bloody so
maybe someone here can help.

I am trying to replace all commas between matched pairs of double quotes.
My first cut worked like a champ. However, it also removed the commas from
between the unmatched pairs of quotes as well. By matched pairs I mean for
any given line, the quotes are taken in pairs (1&2, 3&4, etc). Text between
quotes 2&3, 4&5, etc should be ignored. There can be any number of matched
pairs in a line  and any number of commas between any matched or unmatched
pair of quotes.

Sample text:
nnnnnnnnn,"xxxx",,,"ss,ss,",xxx
nnnnnnnnn,"xx,xx",,"ss,ss",xxx
nnnnnnnnn,xxxx,,"ss,ss,"xxx

First clip:
:Loop1
^!Replace "\"(.*?)\,(.*?)\"" >> "\"$1§$2\"" AIRSTW
^!IfError Next Else Loop1
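
A sketch of why the lazy pattern in the first clip also touches commas between pairs, written with Python's `re` module (an assumption: NoteTab's engine treats `.` and `.*?` the same way here). The `mark_commas` helper is hypothetical and shows one way to take quotes strictly in pairs by excluding the quote character inside the pair.

```python
import re

line = 'nnnnnnnnn,"xxxx",,,"ss,ss,",xxx'

# The lazy '.' also matches a double quote, so the match can run from
# quote 1 across quote 2 and end at quote 3.
crossing = re.search(r'"(.*?),(.*?)"', line).group()
print(crossing)  # "xxxx",,,"

def mark_commas(text: str, temp: str = "§") -> str:
    """Replace commas only inside quote pairs taken left to right (1&2, 3&4, ...)."""
    return re.sub(r'"[^"\r\n]*"',
                  lambda m: m.group().replace(",", temp),
                  text)

print(mark_commas(line))  # nnnnnnnnn,"xxxx",,,"ss§ss§",xxx
```

Because each match consumes a whole pair, the next match can only start at quote 3, never at quote 2. An unmatched trailing quote is left untouched.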

I have tried just about everything to force pairwise matching but to no avail.

This pattern, (\"[^\"\,]*\") correctly finds all matched quote pairs
without embedded commas but attempting to use it as a look behind assertion
has mixed results.

(\"[^\"\,]*\")+\K.  Works fine but lines may not always contain a matching
pattern
(\"[^\"\,]*\")*\K.   Switching to a "*" quantifier destroys the entire
assertion pattern.

Only the "." matches. Why is the pattern/quantifier not greedy? The default
is supposed to be  greedy.

Using the two look behind assertions:

(\"[^\"\,]*\")+?.*?\K(\"(.*?)\,(.*?)\")

Correctly finds all look behind assertions but skips lines like line#2

(\"[^\"\,]*\")*?.*?\K(\"(.*?)\,(.*?)\")

Again, switching quantifiers allows the RegEx engine to take the zero
instance option and just use the pattern after the \K. It incorrectly matches
the first three quotes. Again, why is the assertion not greedy?

My eyes and head are going in circles. Any help would be appreciated. How
can I force RegEx to use the assertion pattern when it exists?

Namaste',   Art

#570 From: Axel Berger <Axel-Berger@...>
Date: Sun Sep 23, 2012 10:19 am
Subject: Re: [NTS] Finding Pairwise Matches
absalom_nemini
 
Art Kocsis wrote:
> I am trying to replace all commas between matched pairs of double quotes.

In other words, all those inside quotes. I have not tried anything, but I
believe ^!SetArray deals with this problem correctly, so you can then
delete the commas inside the array elements. Two nested loops, so it won't be
blindingly fast.

Axel

#571 From: "John Shotsky" <jshotsky@...>
Date: Sun Sep 23, 2012 1:05 pm
Subject: RE: [NTS] Finding Pairwise Matches
shotsky1
 
Art,
;Replace first double quote with opening bracket
^!Replace "^[^\r\n\"]*\K\"" >> "[" ARSW
;Replace next double quote with closing bracket
^!Replace "^[^\r\n\"]*\K\"" >> "]" ARSW
;Repeat as long as double quotes exist
^!IfError Next Else Skip_-2
;Replace commas between opening and closing brackets
^!Replace "\[[^\r\n,\]]*\K,(?=.*\])" >> "" ARSW
^!IfError Next Else Skip_-1
;Change brackets back to double quotes
^!Replace "[\[\]]" >> "\"" ARSW
^!IfError Next Else Skip_-1
Should be pretty fast.
=============================
nnnnnnnnn,"xxxx",,,"ssss",xxx
nnnnnnnnn,"xxxx",,"ssss",xxx
nnnnnnnnn,xxxx,,"ssss"xxx
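
The four steps of the clip above can be emulated outside NoteTab. The following is a sketch in Python, not a translation checked against NoteTab itself: Python's `re` has no `\K`, so a capture group that is put back in the replacement stands in for it, and the function name is hypothetical.

```python
import re

def strip_commas_in_quotes(text: str) -> str:
    # Steps 1-2: turn successive quotes on each line into [ and ] alternately.
    while True:
        text, opened = re.subn(r'^([^\r\n"]*)"', r'\1[', text, flags=re.M)
        text, _ = re.subn(r'^([^\r\n"]*)"', r'\1]', text, flags=re.M)
        if opened == 0:
            break
    # Step 3: remove commas between [ and ], repeating until none are left.
    while True:
        text, removed = re.subn(r'(\[[^\r\n,\]]*),(?=.*\])', r'\1', text)
        if removed == 0:
            break
    # Step 4: change the brackets back to double quotes.
    return re.sub(r'[\[\]]', '"', text)

sample = ('nnnnnnnnn,"xxxx",,,"ss,ss,",xxx\n'
          'nnnnnnnnn,"xx,xx",,"ss,ss",xxx\n'
          'nnnnnnnnn,xxxx,,"ss,ss,"xxx')
print(strip_commas_in_quotes(sample))
```

As in the clip, any square brackets already present in the text would be converted to quotes in the last step, so the temporary characters need to be ones that cannot occur in the data.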

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#572 From: Art Kocsis <artkns@...>
Date: Thu Sep 27, 2012 9:23 pm
Subject: Re: [NTS] Finding Pairwise Matches
artkns
 
At 9/23/2012 03:19 AM, Axel wrote:
>Art Kocsis wrote:
> > I am trying to replace all commas between matched pairs of double quotes.
>In other words, all those inside quotes. I have not tried anything, but I
>believe ^!SetArray deals with this problem correctly, so you can then
>delete the commas inside the array elements. Two nested loops, so it won't be
>blindingly fast.

Thank you, Axel, for responding.

Although I am not sure exactly what you had in mind using arrays I know
there are many ways using clip commands to parse the lines for the matching
double quotes. However, my goal and desire was to do the parsing,
substitution and removals just using RegEx. Speed is not an issue but
elegance, compactness and maintaining/expanding my skills in RegEx is.

Art

#573 From: Art Kocsis <artkns@...>
Date: Fri Sep 28, 2012 5:48 am
Subject: RE: [NTS] Finding Pairwise Matches
artkns
 
At 9/23/2012 06:05 AM, John wrote:
>Art,
>
>;Replace first double quote with opening bracket
>^!Replace "^[^\r\n\"]*\K\"" >> "[" ARSW
>;Replace next double quote with closing bracket
>^!Replace "^[^\r\n\"]*\K\"" >> "]" ARSW
>;Repeat as long as double quotes exist
>^!IfError Next Else Skip_-2
>;Replace commas between opening and closing brackets
>^!Replace "\[[^\r\n,\]]*\K,(?=.*\])" >> "" ARSW
>^!IfError Next Else Skip_-1
>;Change brackets back to double quotes
>^!Replace "[\[\]]" >> "\"" ARSW
>^!IfError Next Else Skip_-1
>Should be pretty fast.
>=============================
>nnnn,"xx,xx",,ss,ss,ssss
>nnnn,"xx,xx",,"xx,,xx",ssss
>nnnn,"yyyy",,,"yyyy",ssss
>nnnn,"yyyy",,"yyyy",,,"xx,xx,",sss
>nnnn,"yyyy",,"x,xxx","yyyy",,,"xx,xx,",sss
>nnnn,sssss,,"xx,xx,",sss,"yyyy",
>nnnn,sssss,,,"yyyy","xx,xx,",sss

Thank you John for your response. It does seem to work fine.
I made some slight changes to your suggestion:
     Used left & right chevrons as the [] chars could appear in the text
     Used a RegEx pattern to change only matched pair of quotes to chevrons
           as your separate commands would also change orphaned quotes

So the essence of the clip is elegant, compact and entirely RegEx - just
what I wanted (earlier tests for uniqueness of temp char & chevrons not shown):
;====================================
^!Set %tc%=§
;First replace all matching double quote pairs with left & right chevron pairs (« and »)
^!Replace "^.*?\K\"(.*?)\"" >> "«$1»" AIRSTW
^!IfError Next Else Skip_-1

;Next, replace all embedded commas between chevron pairs with the unique temp char
^!Replace "«[^\r\n,»]*\K,(?=.*»)" >> "^%tc%" AIRSTW
^!IfError Next Else Skip_-1

;Next, delete all left & right chevrons
^!Replace "«" >> "" AIRSTW
^!Replace "»" >> "" AIRSTW

;Finally, check if there were any unmatched double quote chars remaining
^!Find "\"" AIRSTW
^!IfError Skip_1
^!Continue ###### File Error! File contains unmatched double quote char, Continue or exit?
;====================================

Your pattern, "«[^\r\n,»]*\K,(?=.*»)" is simple, straightforward and pretty
obvious once seen. However, I don't think I would have gotten there. I was
so totally focused on jumping over the possible matching pairs without
commas that I didn't stop to analyze the pattern with commas. Duh! So,
thank you again. My scalp can heal now.


Although I fully understand your pattern I do not understand why the
greedy/not greedy specifications in mine do NOT work. It was my
understanding that greedy, meant "consume as much as possible that match"
and non-greedy meant "stop after the first matching pattern". In both cases
I expected at least one match if allowed by subsequent criteria.

However, for example,

given:           nnn,«yyyy»,,«x,xxx»,«yyyy»,,
the pattern:   («[^\,]*?»)*.*?\K«(.*?)\,(.*?)»
matches:      «yyyy»,,«x,xxx»

Why does the  "(«[^\,]*?»)*" NOT consume the "«yyyy»" and reset the match
point past it?
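
One reading of this result, offered as an interpretation rather than a confirmed answer: the match attempt starts at the beginning of the line, where `nnn` follows, so the starred group can only match zero times there. It is the lazy `.*?` that then advances to the first `«`, and since `.` also matches `»`, the `(.*?),` part runs past the closing chevron. The sketch reproduces the match with Python's `re`; capture group 1 stands in for everything after `\K`, which Python does not support.

```python
import re

s = "nnn,«yyyy»,,«x,xxx»,«yyyy»,,"

# Group 1 plays the role of the text after \K in the original pattern.
m = re.search(r"(?:«[^,]*?»)*.*?(«(.*?),(.*?)»)", s)
print(m.group(1))  # «yyyy»,,«x,xxx»

# Excluding » inside the pair keeps the match within one chevron pair.
inside = re.search(r"«[^»,]*,[^»]*»", s).group()
print(inside)  # «x,xxx»
```

The second pattern does not need to skip the comma-free pairs at all: a pair without a comma simply fails to match and the search moves on.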

Further investigation revealed something not right with NTB/RegEx and why I
was losing so much hair.

It deserves its own subject line and exposition so see next post.

Art

#574 From: Art Kocsis <artkns@...>
Date: Fri Sep 28, 2012 5:48 am
Subject: NTB or RegEx Bug
artkns
 
According to the RegEx help file:

     By default, the quantifiers are "greedy", that is, they match as much
     as possible (up to the maximum number of permitted times), without
     causing the rest of the pattern to fail.

     However, if a quantifier is followed by a question mark, it ceases to
     be greedy, and instead matches the minimum number of times possible.

     With both maximizing ("greedy") and minimizing ("ungreedy" or "lazy")
     repetition, failure of what follows normally causes the repeated item
     to be re-evaluated to see if a different number of repeats allows the
     rest of the pattern to match.

Given the string:    aaaaaaaax

Pattern   Matches     Comment
=======   =========   ============================
ax        ax          as expected
a+x       aaaaaaaax   greedy, as expected per help
a*x       aaaaaaaax   greedy, as expected per help
a+?x      aaaaaaaax   lazy, as expected per help
a*?x      aaaaaaaax   lazy, as expected per help


However, for the string:     aaaaaaaa

Pattern   Matches      Comment
=======   ==========   ====================================
a         a            as expected
a+        aaaaaaaa     greedy, as expected per help
a*        nothing!!!   greedy, ##### NOT per help file ####
a+?       a            lazy, as expected per help
a*?       nothing      lazy, as expected per help

According to the help file, the pattern "a*" should have matched the
maximum permitted - i.e., the entire string of "a"s. However, it stopped
at zero. Why?

Is this a bug in RegEx or in NTB? (tested both NT Std 5.8/fv & 6.2/fv)
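
For comparison, the same patterns can be run through another Perl-style engine. The sketch below uses Python's `re` module and a small hypothetical helper, `first_match`; it shows what that engine returns when the search starts at the beginning of the string and says nothing about NoteTab's internals.

```python
import re
from typing import Optional

def first_match(pattern: str, text: str) -> Optional[str]:
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

with_x = "aaaaaaaax"
only_a = "aaaaaaaa"

for p in ("ax", "a+x", "a*x", "a+?x", "a*?x"):
    print(f"{p:6} {first_match(p, with_x)!r}")
for p in ("a", "a+", "a*", "a+?", "a*?"):
    print(f"{p:6} {first_match(p, only_a)!r}")
```

Here `a*` against the string of a's returns the whole run, which agrees with the help file's description of greedy matching when the search starts on the first `a`.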

#575 From: "John Shotsky" <jshotsky@...>
Date: Fri Sep 28, 2012 11:42 am
Subject: RE: [NTS] NTB or RegEx Bug
shotsky1
 
I think it is working as it is supposed to. You probably don't have your cursor
in the line at the time you are testing.
If it is anywhere else, the 'zero or more' is met by a CR. If you place it
anywhere in the line, it will match from the
cursor to the end of the line, as expected.

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#576 From: Art Kocsis <artkns@...>
Date: Fri Sep 28, 2012 2:20 pm
Subject: RE: [NTS] NTB or RegEx Bug
artkns
 
aaaaaaaaaaa

I disagree that it is working as it is supposed to. Reread the doc - for
greedy it says "the maximum number", not zero. Even starting from a
previous line it should capture all of the a's.

Or better yet, add some spaces in front of the string of a's and place the
cursor within them. Again, a* captures zero chars. Zero satisfies the
condition "zero or more" but it does not satisfy the greedy condition of
"maximum possible".

Even stranger, if you place the cursor one space in front of the string of
a's, a* will capture the entire string (as it should). However, place
the cursor two or more spaces before the string and it captures zero a's
(not as it should) but it advances the cursor one position! Not as it should!

I wonder if this is one of their "optimizations" gone awry.

Art
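
A note on the "spaces in front" test, sketched with Python's `re` module. In Perl-style engines, greediness only maximizes the repeat at the position where a match begins; the search itself returns the leftmost match, and at a space `a*` succeeds with zero characters. Treating the cursor position as the search offset is an assumption about NoteTab, and this does not account for the one-space case reported above.

```python
import re

text = "  aaaaaaaa"  # two spaces, then the run of a's

# Leftmost match wins: zero a's at offset 0 is already a valid match.
m = re.search(r"a*", text)
print(repr(m.group()), m.span())  # '' (0, 0)

# Starting the search on the first 'a' yields the full run.
m2 = re.compile(r"a*").search(text, 2)
print(repr(m2.group()), m2.span())  # 'aaaaaaaa' (2, 10)

# Requiring at least one 'a' rules out the empty match.
m3 = re.search(r"a+", text)
print(m3.span())  # (2, 10)
```

Where an empty match is not wanted, `a+` is usually the safer pattern.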

At 9/28/2012 04:42 AM, John wrote:
>I think it is working as it is supposed to. You probably don't have your
>cursor in the line at the time you are testing.
>If it is anywhere else, the 'zero or more' is met by a CR. If you place it
>anywhere in the line, it will match from the
>cursor to the end of the line, as expected.
>
>
>From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On
>Behalf Of Art Kocsis
>Sent: Thursday, September 27, 2012 22:48
>To: NoteTab-Scripts
>Subject: [NTS] NTB or RegEx Bug
>
>According to the RegEx help file:
>
>By default, the quantifiers are "greedy", that is, they match as much
>as possible (up to the maximum number of permitted times),
>without causing the rest of the pattern to fail.
>
>However, if a quantifier is followed by a question mark, it ceases to
>be greedy, and instead matches the minimum number of times possible
>
>With both maximizing ("greedy") and minimizing ("ungreedy" or "lazy")
>repetition, failure of what follows normally causes the repeated
>item to be re-evaluated to see if a different number of repeats
>allows the rest of the pattern to match.
>
>However, for the string: aaaaaaaa
>
>The pattern   matches      Comment
>===========   ==========   =====================
>a             a            as expected
>a+            aaaaaaaa     greedy, as expected per help
>a*            nothing!!!   greedy, ##### NOT per help file ####
>a+?           a            lazy, as expected per help
>a*?           nothing      lazy, as expected per help
>
>According to the help file, the pattern "a*" should have matched the
>maximum permitted - i.e., the entire strings of "a"s. However, it stopped
>at zero. Why?
>
>Is this a bug in RegEx or in NTB? (tested both NT Std 5.8/fv & 6.2/fv)

#577 From: Eric Fookes <egroups@...>
Date: Fri Sep 28, 2012 2:33 pm
Subject: Re: [NTS] NTB or RegEx Bug
eric_fookes
 
Hi Art,

> However, for the string:     aaaaaaaa
>
> The pattern     matches Comment
> ========     ======== =====================
>  a* nothing!!! greedy, #####  NOT per help file   ####
>
> Is this a bug in RegEx or in NTB? (tested both NT Std 5.8/fv & 6.2/fv)

Indeed, it looks like a bug in the regex engine of those NoteTab
versions. I just tested NoteTab 7 and it worked correctly. So it seems
the updated regex engine used in the recent NoteTab releases fixed the
issue.

--
Regards,

Eric Fookes
http://www.fookes.com/

#578 From: "John Shotsky" <jshotsky@...>
Date: Fri Sep 28, 2012 2:43 pm
Subject: RE: [NTS] NTB or RegEx Bug
shotsky1
 
It may be a matter of opinion, but when a * is used, the engine first
evaluates the very next character following the cursor, or the point in the
line that is being evaluated. If that position meets the condition, it stops.
(In this case, the zero condition is true, so it stops.) Thus, if the first
character is a space or a CR, it will stop. I have run into this issue in
programming before, and didn't understand then why it wasn't working as I
thought it should. Finally, it occurred to me that it starts with one
character, and if that character is NOT the one looked for, it stops. You
will notice that if you place the cursor anywhere in the line of a's, it will
capture that one and all the rest, but none before that point. Once
understood, it becomes a feature that you can use to detect that condition.

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#579 From: "flo.gehrke" <flo.gehrke@...>
Date: Fri Sep 28, 2012 2:56 pm
Subject: Re: NTB or RegEx Bug
flo.gehrke
 
--- In ntb-scripts@yahoogroups.com, Art Kocsis <artkns@...> wrote:
>
> However, for the string:     aaaaaaaa
> a* nothing!!!
>
> Is this a bug in RegEx or in NTB? (tested both NT
> Std 5.8/fv & 6.2/fv)

No bug -- neither in PCRE nor in NTb.

'a*' is equivalent to a{0,}. So, at the beginning of the subject string, the
engine achieves a match of zero length because in...

However...aaa

it finds an 'H' at the beginning, i.e. the absence of 'a'. Since this match
doesn't consume any character, you don't see it.

Try this: Open the Find dialog, enter 'a*', and close it again. Now press F3
repeatedly and watch the cursor moving forward. It stops at any position where
the pattern is true, i.e. where an 'a' is absent or where the engine doesn't see
an 'a' when looking to the right. Ntb will match 'aaa' as soon as the engine
reaches that string.

There is a PCRE_NotEmpty match option that changes this behavior. You can test
this with the Workbench for DIRegEx (the embedding of PCRE into Ntb). But there
is no way to activate that option in Ntb. When choosing that option, the engine
immediately selects 'aaa'.
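
The zero-length matches described above can be made visible outside NoteTab. The following is a minimal sketch in Python, assuming its built-in re module as a stand-in for PCRE (it is not the DIRegEx embedding). Python has no "not empty" match option, so the last step merely filters out empty matches by hand; that filter is an illustration, not the PCRE option itself.

```python
import re

SUBJECT = "However...aaa"

# The first attempt at offset 0 already succeeds, with a match of length zero.
first = re.search(r"a*", SUBJECT)
first_span = first.span()

# Repeating the search (comparable to pressing F3) yields one empty match per
# position until the run of 'a' is reached.
all_spans = [m.span() for m in re.finditer(r"a*", SUBJECT)]

# Hand-written stand-in for a "not empty" option: skip zero-length matches.
non_empty = [m.group(0) for m in re.finditer(r"a*", SUBJECT) if m.group(0)]

print(first_span)
print(all_spans)
print(non_empty)
```

A span such as (0, 0) is a successful match that consumes nothing, which is different from a failed search.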

Regards,
Flo

#580 From: Art Kocsis <artkns@...>
Date: Sat Sep 29, 2012 12:09 am
Subject: Re: [NTS] Re: NTB or RegEx Bug
artkns
 
Yes, a zero length match satisfies a{0,}. So do all of the other
remaining partial matches. But the whole point is that a* is supposed to
be greedy. Look at the definition of greedy - it clearly specifies the
MAXIMUM match, not the minimum. A zero length match is the minimum, not the
maximum possibility. The question mark "lazy" metacharacter specifies the
minimum match. There should be a difference between greedy and non-greedy.
Even Eric agreed.

Also, a non-match should not move the cursor. Try any other search that
fails and observe the status bar cursor location. It does not move.
Contrariwise, a normal search starts at the current cursor position and
searches forward (to the end of the file if necessary), looking for a
possible match. a* doesn't seem to want to get off its duff to even start.
[Does that mean that a* is lazy? <g>]

BTW - You don't have to go thru all the hassle of defining, closing and
reopening the Find window. Clicking on Find Next works quite well. As does
F3 with the window open.

BTW2 - Thanks for the heads up on DIRegEx. I will look at it. I assume that
is the one you use. Is it freeware or shareware? I can't find any
registration info on the site. What other RegEx apps have you tried?
[http://www.yunqa.de/delphi/doku.php/products/regex/index#diregex]

Art


#581 From: "flo.gehrke" <flo.gehrke@...>
Date: Sat Sep 29, 2012 1:51 pm
Subject: [NTS] Re: NTB or RegEx Bug
flo.gehrke
 
I understand that you are testing a subject string like...

However ... aaa ...

starting from the beginning of the line. So 'aaa' is not at the start of line
but somewhere behind 'However...'.

> Yes, a zero length match satisfies a{0,}

OK, so the difference between "no match" and "match of zero length" (or "zero
match") can be assumed to be clear.

> But the  whole point is that a* is supposed to
> be greedy.

No doubt, but a "sequence of zero matches" at the same position is not
imaginable -- so greediness doesn't matter here at the first positions.
Greediness first matters when the engine reaches 'aaa'. Only at that
position does the pattern match all the 'a's, since it is greedy.

> Also, a non-match should not move the cursor.

Possibly, there are two misunderstandings here: 1. Again, the engine doesn't
achieve "non-matches" but matches of zero length. 2. The cursor isn't moved here
by "non-matches" but by repeatedly re-starting Find.

> Contriwise, a normal search starts at the current cursor
> position and searches forward (to the end of the file if
> necessary)

No doubt, and that's exactly what the engine does, even in this case. Compare it
with...

^!Replace "a*" >> "!" WARS

tested against 'xxxxxx'. The result is '!x!x!x!x!x!x!' -- i.e., the engine
achieves seven zero matches, one at every position where 'zero a' is true.
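
The same replacement can be reproduced outside NoteTab. This is a small sketch in Python, assuming its built-in re module behaves like PCRE for this simple case; re.sub plays the role of the ^!Replace clip here and is not NoteTab's own engine.

```python
import re

# Every position in 'xxxxxx' (before each 'x' and at the very end) allows
# "zero a", so each of them gets its own zero-length match.
result = re.sub(r"a*", "!", "xxxxxx")
empty_matches = [m.span() for m in re.finditer(r"a*", "xxxxxx")]

print(result)
print(len(empty_matches))
```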

Another question is: Is there a work-around in Ntb for that issue?

You could use 'a*a' instead of 'a*'. In this case, you are forcing the engine to
find zero or more 'a' being followed by another 'a'. Thus the zero matches don't
act as a "brake" any more, and the engine will immediately find and select
'aaa'.
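
To see the effect of that work-around, here is a short sketch in Python, again assuming the built-in re module as a rough stand-in for the engine in Ntb. It compares 'a*', 'a*a' and 'a+' on the sample line.

```python
import re

SUBJECT = "However...aaa"

star_only = re.search(r"a*", SUBJECT)     # zero-length match at offset 0
star_then_a = re.search(r"a*a", SUBJECT)  # needs at least one real 'a'
plus = re.search(r"a+", SUBJECT)          # the more usual spelling

print(star_only.span(), repr(star_only.group(0)))
print(star_then_a.span(), repr(star_then_a.group(0)))
print(plus.span(), repr(plus.group(0)))
```

In this sketch 'a*a' and 'a+' select the same text, because both require at least one 'a' to be consumed.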

> Thanks for the heads up on DIRegEx. I will look at it. I assume
> that is the one you use. Is it freeware or shareware?

For a short time, it was available for betatesters but now the link doesn't work
any more.

> What other RegEx apps have you tried?

So far, I don't know any app that fully supports PCRE -- except that workbench
and Ntb itself. Sometimes helpful is http://weitz.de/regex-coach/ but it isn't
updated to the latest version of PCRE and doesn't support some PCRE features
either. I think that's the same problem with RegexBuddy which has often been
recommended by Alec Burgess. Please correct me if I'm wrong.

Regards,
Flo


#582 From: Axel Berger <Axel-Berger@...>
Date: Sat Sep 29, 2012 2:07 pm
Subject: Re: [NTS] Re: NTB or RegEx Bug
absalom_nemini
 
"flo.gehrke" wrote:
> You could use 'a*a' instead of 'a*'.

I think in practice a star quantifier on its own is meaningless, there
must be at least one other thing in the pattern. If you're interested in
one character only, then the quantifier has to be at least "+". All
Art's "a*x" examples worked fine and so do all cases where I use the "*"
or "?" quantifier.
What can the possible use be for something that matches anywhere in
anything?

Axel

#583 From: "flo.gehrke" <flo.gehrke@...>
Date: Sun Sep 30, 2012 12:39 pm
Subject: [NTS] Re: NTB or RegEx Bug
flo.gehrke
 
--- In ntb-scripts@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
>
> "flo.gehrke" wrote:
> > You could use 'a*a' instead of 'a*'.
> I think in practice a star quantifier on its own is
> meaningless, there must be at least one other thing in the
> pattern...What can the possible use be for something that
> matches anywhere in anything?

In this context, I understood that 'a' is just an element that, "in practice",
would primarily represent an element in a more complex pattern. In this respect,
I agree with the objection you made.

In order to match just a sequence of literal 'a', a pattern like 'a*a' wouldn't
make much sense, indeed. And 'a{1,}' or 'a+' would certainly be more appropriate
solutions.

But to prevent any misunderstanding among beginners, we should stress that
something like 'a*' is by no means always useless. Quite often, we have to
allow for an element 'a' being there or not being there.

For example: (?<=<xxx>)\d*(?=</xxx>) matching the position between '>' and '<'
in strings like...

<xxx>12</xxx>
<xxx></xxx>
<xxx>9</xxx>

no matter if there is a number or no number.
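
A quick way to check this outside Ntb is the following sketch in Python. It assumes the built-in re module, which accepts this particular pattern unchanged because the lookbehind has a fixed width; the variant with '\d+' is added only for contrast.

```python
import re

SAMPLE = "<xxx>12</xxx>\n<xxx></xxx>\n<xxx>9</xxx>"

# \d* also succeeds on the empty position between the tags of the second line.
BETWEEN_TAGS = re.compile(r"(?<=<xxx>)\d*(?=</xxx>)")
found = BETWEEN_TAGS.findall(SAMPLE)

# With \d+ the empty element is no longer matched.
with_plus = re.findall(r"(?<=<xxx>)\d+(?=</xxx>)", SAMPLE)

print(found)
print(with_plus)
```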

Regards,
Flo

#584 From: "John Shotsky" <jshotsky@...>
Date: Sun Sep 30, 2012 1:11 pm
Subject: RE: [NTS] Re: NTB or RegEx Bug
shotsky1
 
I agree, and use the star heavily in my clip libraries.

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#585 From: "flo.gehrke" <flo.gehrke@...>
Date: Sun Sep 30, 2012 3:31 pm
Subject: Re: [NTS] Finding Pairwise Matches
flo.gehrke
 
--- In ntb-scripts@yahoogroups.com, Art Kocsis <artkns@...> wrote:

> given:           nnn,«yyyy»,,«x,xxx»,«yyyy»,,
> the pattern:   («[^\,]*?»)*.*?\K«(.*?)\,(.*?)»
> matches:      «yyyy»,,«x,xxx»
>
> Why does the  "(«[^\,]*?»)*" NOT consume the "«yyyy»" and reset the
> match point past it?

A single '«[^\,]*?»' or even '«[^,]+»' (no need to escape comma in character
class) would match that '«yyyy»' section but your RegEx is demanding more than
that.

In short, the engine starts testing '(«[^\,]*?»)*.*?\K«'. Your subject string,
however, starts with 'nnn...'. So the engine doesn't achieve any submatch until
it's testing '.*?\K«'. Now backtracking to '.' it matches 'nnn,«' because each
character is matched with the dot. So 'nnn,' is not skipped, and it goes on till
'nnn,«yyyy»,,«x,xxx»' is matched in the end.
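
The walk-through above can be traced with a small sketch in Python. It assumes the built-in re module, which has no \K, so the part of the pattern after \K is wrapped in an extra capturing group (group 2) instead; apart from that, the pattern is the one quoted above with the unneeded comma escapes dropped.

```python
import re

SUBJECT = "nnn,«yyyy»,,«x,xxx»,«yyyy»,,"

# Group 1 is the optional «...» prefix; group 2 stands in for "after \K".
PATTERN = re.compile(r"(«[^,]*?»)*.*?(«(.*?),(.*?)»)")

m = PATTERN.search(SUBJECT)
whole_match = m.group(0)    # everything the engine consumed
skipped_group = m.group(1)  # None: the starred group matched zero times
after_k = m.group(2)        # what \K would report as the match

print(repr(whole_match))
print(repr(skipped_group))
print(repr(after_k))
```

Group 1 stays unset because the attempt begins at 'nnn', where the starred group can only match zero times; the lazy dot then carries the match up to the first '«'.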

BTW, for me, a simple clip like...

^!Jump Doc_Start
:Loop
^!Find ""[^"]+"" RS
^!IfError End
^!IfMatch "[^,]+" "^$GetSelection$" Skip
^!InsertText ""^$StrReplace(,;;^$GetSelection$;A)$""
^!Goto Loop

(designed for Ntb 7.0) would perfectly do the job (removing commas between
opening and closing brackets) when run against your sample string...

nnnnnnnnn,"xxxx",,,"ss,ss,",xxx
nnnnnnnnn,"xx,xx",,"ss,ss",xxx
nnnnnnnnn,xxxx,,"ss,ss,"xxx
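
For readers who want to try the same idea outside Ntb, here is a rough equivalent of that clip as a Python sketch. It assumes Python's built-in re module, and the function name strip_commas_in_quotes is invented for this example. Like the clip, it finds each quoted section and deletes the commas inside it.

```python
import re

SAMPLE = (
    'nnnnnnnnn,"xxxx",,,"ss,ss,",xxx\n'
    'nnnnnnnnn,"xx,xx",,"ss,ss",xxx\n'
    'nnnnnnnnn,xxxx,,"ss,ss,"xxx'
)


def strip_commas_in_quotes(text: str) -> str:
    """Delete commas inside each "..." section, leaving other commas alone."""
    return re.sub(r'"[^"]+"', lambda m: m.group(0).replace(",", ""), text)


cleaned = strip_commas_in_quotes(SAMPLE)
print(cleaned)
```

The sketch relies on the quotes being balanced, as they are in the sample lines.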

Even...

^!Jump Doc_Start
^!Find ""\w+,(\w|,)*"" RS
^!IfError End
^!InsertText ""^$StrReplace(,;;^$GetSelection$;A)$""
^!Goto Skip_-3

would do it if there were no more variations (?) in the string.

Members being at war with RegEx will be happy to see that they could even find a
solution without any RegEx at all:

^!Jump Doc_Start
:Loop
^!Find """ RS
^!IfError End
^!MoveCursor +1
^!Keyboard CTRL+M &50
^!IfFalse ^$StrPos(",";"^$GetSelection$";0)$ Skip
^!InsertSelect "^$StrReplace(",";"";"^$GetSelection$";A)$"
^!Jump Select_End
^!Goto Loop

Regards,
Flo

#586 From: "EB" <ea_young@...>
Date: Wed Oct 10, 2012 1:00 am
Subject: How to search a QUESTION MARK in a regular expression
ea_young
 
I have a bunch of lines in a file that look something like this:

Ga53t76lah/Z3vdeg14/V0c2/?freds=OTvvA4OfhbfjI6MQ,0,0,0,&p=31

I want to strip everything to the right of and including the QUESTION MARK using
a regular expression

    / ?fred(.*)p=31   did not work as a search string in FIND.

It did work if I removed the question mark - but that left the question mark in
place after the REPLACE.

I can - turning the regular expression block off - FIND/REPLACE the question
mark with a less tricky character and then run (with RegEx on) the FIND/REPLACE
to get rid of everything to the right of and including the new character - but
that is additional steps.

Is there a way to identify the question mark as a question mark instead of
special character within the original RegEx FIND/REPLACE? (The [:?:] construct
did not seem to work.)

Thank you.

#587 From: "John Shotsky" <jshotsky@...>
Date: Wed Oct 10, 2012 1:11 am
Subject: RE: [NTS] How to search a QUESTION MARK in a regular expression
shotsky1
 
\?

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#588 From: Robert Bull <barlennan@...>
Date: Wed Oct 10, 2012 2:01 pm
Subject: Re: [NTS] How to search a QUESTION MARK in a regular expression
barlennan
 
Wednesday, October 10, 2012, 2:00:55 AM, EB wrote:

E>    / ?fred(.*)p=31   did not work as a search string in FIND.

I think John's right to suggest trying "\?". In regular expressions,
"?" is a metacharacter that means the character to its left may or may
not be present. Quoting The AWK Programming Language, "(r)? matches
the null string or any string matched by r ..." Ergo, you probably
need to "escape" a "?" with a backslash in front of it.
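
The difference is easy to see in a small sketch. The example below uses Python's built-in re module as an assumption (it is not the NoteTab engine, but it treats "?" the same way), and the two patterns are simplified versions of the one in the question.

```python
import re

LINE = "Ga53t76lah/Z3vdeg14/V0c2/?freds=OTvvA4OfhbfjI6MQ,0,0,0,&p=31"

# Unescaped: '/?' means "an optional slash", so the literal '?' is left behind.
unescaped = re.sub(r"/?fred.*", "", LINE)

# Escaped: '\?' matches the question mark itself.
escaped = re.sub(r"\?fred.*", "", LINE)

print(unescaped)
print(escaped)
```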

--
Regards,

Robert Bull
   mailto:barlennan@...

#589 From: "John Shotsky" <jshotsky@...>
Date: Wed Oct 10, 2012 2:13 pm
Subject: RE: [NTS] How to search a QUESTION MARK in a regular expression
shotsky1
 
I use this all the time to find ends of sentences:
(\.|\?|!)(\R|\x20|\)|\")
It could also be done using classes (which don't need escapes except for a few
items):
[?.!]+[")\x20]+
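
Here is a small sketch comparing the two forms in Python. It assumes the built-in re module, which does not support \R, so the alternation form below substitutes an explicit newline alternative for it; the sample sentence is invented for the test.

```python
import re

# Alternation form, adapted from above: \R is replaced by '\r?\n' because
# Python's re does not know \R.
ALTERNATION = re.compile(r'(\.|\?|!)(\r?\n|\x20|\)|")')

# Character-class form: '?', '.', '!' and ')' need no backslash inside [...].
CHAR_CLASS = re.compile(r'[?.!]+[")\x20]+')

TEXT = 'Is it done? Yes! (Really.) He said "no." End'

alt_hits = [m.group(0) for m in ALTERNATION.finditer(TEXT)]
cls_hits = [m.group(0) for m in CHAR_CLASS.finditer(TEXT)]

print(alt_hits)
print(cls_hits)
```

The two forms are close but not identical: because of the "+" quantifiers, the class version also swallows a space that follows a closing quote or bracket.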

Regards,
John
RecipeTools Web Site: http://recipetools.gotdns.com/


#590 From: "flo.gehrke" <flo.gehrke@...>
Date: Wed Oct 10, 2012 3:40 pm
Subject: Re: [NTS] How to search a QUESTION MARK in a regular expression
flo.gehrke
 
--- In ntb-scripts@yahoogroups.com, Robert Bull <barlennan@...> wrote:
>
> Ergo, you probably need to "escape" a "?" with a backslash
> in front of it.

Why "probably"? It's a PCRE rule, and John is undoubtedly right. See "Backslash"
in the Help on Regex:

> If (the backslash) is followed by a character that is not a
> number or a letter, it takes away any special meaning that
> character may have. This use of backslash as an escape character
> applies both inside and outside character classes.

An alternative is to search the '?' in hex '\x3F' or octal '\077'.
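
These spellings can be compared side by side in a short sketch. It assumes Python's built-in re module, which happens to accept the same backslash, hex and octal escapes; the character-class variant '[?]' is added here for completeness and comes from the class-based approach mentioned earlier in the thread.

```python
import re

LINE = "Ga53t76lah/Z3vdeg14/V0c2/?freds=OTvvA4OfhbfjI6MQ,0,0,0,&p=31"

# Four ways to write a literal question mark, each followed by "rest of line".
variants = {
    "backslash": r"\?.*",
    "hex": r"\x3F.*",
    "octal": r"\077.*",
    "class": r"[?].*",
}

stripped = {name: re.sub(pattern, "", LINE) for name, pattern in variants.items()}

for name, value in stripped.items():
    print(f"{name:9} -> {value}")
```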

Regards,
Flo

#591 From: "EB" <ea_young@...>
Date: Wed Oct 10, 2012 3:14 pm
Subject: Re: [NTS] How to search a QUESTION MARK in a regular expression
ea_young
 
As is so often the case, I overlooked the obvious.  I thought I had tried pretty
much everything - but apparently not. The \ seems to work.  THANK YOU!

