Filter elements with a regex, only if they are in a certain block
In a string (in reality it's much bigger):
s = """
BeginA
Qwerty
Element 11 35
EndA
BeginB
Element 12 38
...
Element 198 38
EndB
BeginA
Element 81132 38
SomethingElse
EndA
BeginB
Element 12 39
Element 198 38
EndB
"""
how can I replace every Element <anythinghere> 38 that is inside a BeginB...EndB block (and only those!) with Element ABC?
I tried:
s = re.sub(r'Element .* 38', 'Element ABC', s)
but this doesn't check whether the match is inside a BeginB...EndB block, so elements in BeginA...EndA blocks are replaced too. How can I do this?
python regex string
asked Nov 8 at 17:03
Basj
Your code is actually working. I don't see how the output is different from what you want.
– ninesalt
Nov 8 at 17:08
@ninesalt I want to replace only the elements which are inside a BeginB...EndB block, not those which are in BeginA...EndA blocks.
– Basj
Nov 8 at 17:10
2 Answers
accepted
Use two expressions:
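import re
# Match each BeginB...EndB block lazily, newlines included.
block = re.compile(r'BeginB[\s\S]+?EndB')
# Match an element line whose last field is exactly 38.
element = re.compile(r'Element.*?\b38\b')
def repl(match):
    return element.sub('Element ABC', match.group(0))
nstring = block.sub(repl, s)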
print(nstring)
This yields
BeginA
Qwerty
Element 11 35
EndA
BeginB
Element ABC
...
Element ABC
EndB
BeginA
Element 81132 38
SomethingElse
EndA
BeginB
Element 12 39
Element ABC
EndB
See a demo on ideone.com.
Without re.compile (just to get the idea):
def repl(match):
    return re.sub(r'Element.*?\b38\b', 'Element ABC', match.group(0))
print(re.sub(r'BeginB[\s\S]+?EndB', repl, s))
The important idea here is that the second parameter of re.sub can be a function.
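Putting the pieces together, here is a minimal self-contained sketch of the callable-replacement idea, run against the sample string from the question. The wrapper name replace_in_block_b is made up for illustration; the two patterns are the ones from this answer with their backslashes written out.

```python
import re

# Sample input from the question.
s = """
BeginA
Qwerty
Element 11 35
EndA
BeginB
Element 12 38
...
Element 198 38
EndB
BeginA
Element 81132 38
SomethingElse
EndA
BeginB
Element 12 39
Element 198 38
EndB
"""

# Outer pattern: one whole BeginB...EndB block, matched lazily across lines.
BLOCK = re.compile(r'BeginB[\s\S]+?EndB')
# Inner pattern: an element line that ends in the standalone number 38.
ELEMENT = re.compile(r'Element.*?\b38\b')


def replace_in_block_b(text):
    """Rewrite 'Element ... 38' lines, but only inside BeginB...EndB blocks.

    Hypothetical helper name, used here only for illustration.
    """
    def repl(match):
        # match.group(0) is the full text of one BeginB...EndB block.
        return ELEMENT.sub('Element ABC', match.group(0))

    # re.sub calls repl once per block and splices the returned text back in.
    return BLOCK.sub(repl, text)


result = replace_in_block_b(s)
print(result)
```

Because the inner substitution only ever sees the text of a BeginB...EndB block, anything in the BeginA...EndA blocks is left alone.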
You could very well do it without a function, but you'd need the newer regex module, which supports \G and \K:
import regex
rx = regex.compile(r'''
    (?:\G(?!\A)|BeginB)
    (?:(?!EndB)[\s\S])+?\K
    Element.+?\b38\b''', regex.VERBOSE)
s = rx.sub('Element ABC', s)
print(s)
See another demo for this one on regex101.com as well.
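If installing the third-party regex module is not an option, a different technique (not the one from this answer) is to walk the text line by line and track whether you are inside a block. The sketch below assumes each Begin/End marker and each element sits on its own line, as in the question's sample. The function name replace_by_scanning and its parameters are hypothetical.

```python
import re

# An element line whose last field is exactly 38 (checked on the stripped line).
ELEMENT_38 = re.compile(r'^Element\b.*\b38$')


def replace_by_scanning(text, begin='BeginB', end='EndB',
                        replacement='Element ABC'):
    """Replace matching element lines only while inside a begin...end block.

    Assumes markers and elements each occupy a whole line (hypothetical helper).
    """
    out = []
    inside = False
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped == begin:
            inside = True
        elif stripped == end:
            inside = False
        elif inside and ELEMENT_38.match(stripped):
            # Keep the original line ending, swap only the content.
            ending = line[len(line.rstrip('\r\n')):]
            line = replacement + ending
        out.append(line)
    return ''.join(out)


sample = (
    "BeginA\nElement 1 38\nEndA\n"
    "BeginB\nElement 2 38\nElement 3 39\nEndB\n"
)
scanned = replace_by_scanning(sample)
print(scanned)
```

This trades the compactness of a single pattern for a single pass over the input with no backtracking, which may be easier to reason about on very large inputs.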
edited Nov 9 at 7:42
answered Nov 8 at 18:19
Jan
Wonderful, I forgot that we could use a function as the second parameter of re.sub! I edited your answer to add these details, I hope you don't mind @Jan.
– Basj
Nov 8 at 22:19
Try the following:
r'(?s)(?<=BeginB)\s+Element\s+(\d+)\s+\d+.*?(?=EndB)'
You can test it here.
For your example, I would echo @Jan's answer and use two separate regular expressions:
import re
restrict = re.compile(r'(?s)(?<=BeginB).*?(?=EndB)')
pattern = re.compile(r'Element\s+(\d+)\s+38')
def repl(block):
    return pattern.sub('Element ABC', block.group(0))
out = restrict.sub(repl, s)
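Since the real data reportedly also contains other block types (BeginA...EndA, BeginC...EndC, etc.), the same two-step idea can be parameterized by block name. This is only a sketch: the helper name replace_in_blocks and its arguments are invented for illustration, and it assumes the markers always have the form Begin<name> / End<name>.

```python
import re


def replace_in_blocks(text, name, pattern, replacement):
    """Apply pattern -> replacement only inside Begin<name>...End<name> blocks.

    Hypothetical helper; assumes markers are literally 'Begin' + name and
    'End' + name.
    """
    escaped = re.escape(name)
    # Lookarounds keep the markers themselves out of the matched text.
    # The \b after the lookbehind stops 'BeginB' from matching 'BeginBC'.
    restrict = re.compile(rf'(?s)(?<=Begin{escaped})\b.*?(?=End{escaped}\b)')
    element = re.compile(pattern)

    def repl(block):
        # block.group(0) is the body between the two markers.
        return element.sub(replacement, block.group(0))

    return restrict.sub(repl, text)


text = "BeginB\nElement 12 38\nEndB\nBeginC\nElement 7 38\nEndC\n"
out_c = replace_in_blocks(text, 'C', r'Element\s+\d+\s+38\b', 'Element ABC')
print(out_c)
```

Here only the element inside BeginC...EndC is rewritten; the BeginB...EndB block is left as it was.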
edited Nov 8 at 19:08
answered Nov 8 at 17:24
rahlf23
How do you do the replace? s = re.sub(r'(?s)(?<=BeginB)\s+Element\s+(\d+)\s+\d+.*?(?=EndB)', r'Element ABC', s) doesn't work directly, so I guess we should modify the replacement string r'Element ABC', but how?
– Basj
Nov 8 at 17:28
Ok thanks @rahlf23 !
– Basj
Nov 8 at 17:36
To clarify, from your example, you would want to replace 12, 198 and 12 again, correct?
– rahlf23
Nov 8 at 17:37
Yes indeed. All the lines inside a BeginB...EndB block which are of the form Element .... 38. IRL I have a few (but not many) BeginB...EndB blocks, thousands of elements inside them, and other blocks BeginA...EndA, BeginC...EndC, etc.
– Basj
Nov 8 at 17:40