


Split SQL statements on function name but keep delimiter in Python


















Suppose I have the following string containing SQL expressions extracted from a SELECT clause (in reality it comes from a huge SQL statement with hundreds of such expressions):



 SUM(case when(A.money-B.money>1000
and A.unixtime-B.unixtime<=890769
and B.col10 = "A"
and B.col11 = "12"
and B.col12 = "V") then 10
end) as finalCond0,
MAX(case when(A.money-B.money<0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "4321"
and B.cond3 in ("E", "F", "G")) then A.col10
end) as finalCond1,
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


How can I split this string on the function names (e.g. SUM, MAX, MIN, MEAN) so that I can extract the last expression without removing the delimiter (which in this case is SUM)?



So the desired output would be a string like the one below:



 SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


PS: The line breaks and indentation above are only for presentation. In reality the statements are separated by just a comma, so no extra whitespace or newlines appear in the original string.






























  • Have you tried to split by comma (,) ?

    – Ralf
    Mar 7 at 11:05











  • @Ralf It won't work in this scenario. A split on , (sql.split(',').pop()) would give "C")) then 2 end) as finalCond2

    – Old-School
    Mar 7 at 11:07












  • Hm... and .split(',\n') ?

    – Ralf
    Mar 7 at 11:09











  • @Ralf Won't work either. For presentation purposes I have provided some sort of indentation but in reality these statements are just separated by a comma meaning that no whitespaces or new lines appear in the original form.

    – Old-School
    Mar 7 at 11:11















python sql regex split














edited Mar 7 at 12:27









Martijn Pieters










asked Mar 7 at 11:01









Old-School

























3 Answers














You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.



The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external use.



Re-using the walk_tokens function I defined in the other post I linked to:



from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
    # breadth-first walk over the token tree
    queue = deque([token])
    while queue:
        token = queue.popleft()
        if isinstance(token, TokenList):
            queue.extend(token)
        yield token


extracting the last element from the SELECT identifier list then is:



import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
    if isinstance(tok, IdentifierList):
        # iterate to leave the last assigned to `identifier`
        for identifier in tok.get_identifiers():
            pass
        break

print(identifier)


Demo:



>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
...     if isinstance(tok, IdentifierList):
...         # iterate to leave the last assigned to `identifier`
...         for identifier in tok.get_identifiers():
...             pass
...         break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.
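To illustrate why nesting matters, here is a minimal standard-library sketch that splits only on commas at parenthesis depth zero and outside quoted literals. This is not the sqlparse approach shown above, only a hand-written scanner; the function name split_top_level is hypothetical, and the sketch assumes that quotes are not backslash-escaped and that the input contains no SQL comments.

```python
def split_top_level(sql, sep=','):
    """Split sql on sep, ignoring separators inside parentheses or quotes.

    Hypothetical helper for illustration; not part of sqlparse.
    """
    parts = []
    current = []
    depth = 0      # current parenthesis nesting level
    quote = None   # the quote character we are inside, if any
    for ch in sql:
        if quote:
            # inside a quoted literal: only the matching quote ends it
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch == sep and depth == 0:
            # a top-level separator closes the current part
            parts.append(''.join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append(''.join(current).strip())
    return parts


sql = ('SUM(case when(A.money-B.money>0 and B.cond3 in ("A", "B", "C")) '
       'then 2 end) as finalCond0,'
       'MAX(case when(B.cond2 = "4321") then A.col10 end) as finalCond1')
print(split_top_level(sql)[-1])
```

The commas inside in ("A", "B", "C") sit at depth 2, so they are skipped. A real parser is still the safer choice for anything beyond simple expression lists.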































  • Wouldn't it be much easier with a regular expression?

    – Old-School
    Mar 7 at 11:35











  • @Old-School: no, because Python's regular expressions can't be used to parse nested structures.

    – Martijn Pieters
    Mar 7 at 11:36











  • @Old-School: SQL is not 'regular'; you can't predict the number of parentheses, or where commas are going to be used, etc.

    – Martijn Pieters
    Mar 7 at 11:37











  • Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

    – Old-School
    Mar 7 at 11:40











  • @Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

    – Martijn Pieters
    Mar 7 at 11:55
































I have a solution, though it takes a fair amount of code. It does not use regex, just repeated splitting on keywords.



s = """
SUM(case when(A.money-B.money>1000
and A.unixtime-B.unixtime<=890769
and B.col10 = "A"
and B.col11 = "12"
and B.col12 = "V") then 10
end) as finalCond0,
MAX(case when(A.money-B.money<0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "4321"
and B.cond3 in ("E", "F", "G")) then A.col10
end) as finalCond1,
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2
"""

# remove newlines and double spaces
s = s.replace('\n', ' ')
while '  ' in s:
    s = s.replace('  ', ' ')
s = s.strip()

# split on keywords, starting with the original string
current_parts = [s, ]
for kw in ['SUM', 'MAX', 'MIN']:
    new_parts = []
    for part in current_parts:
        for i, new_part in enumerate(part.split(kw)):
            if i > 0:
                # add keyword to the start of this substring
                new_part = '{}{}'.format(kw, new_part)

            new_part = new_part.strip()
            if len(new_part) > 0:
                new_parts.append(new_part.strip())

    current_parts = new_parts

print()
print('current_parts:')
for s in current_parts:
    print(s)


The output I get is:



current_parts:
SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0,
MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1,
SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2


Does it work for you? It seems to work for the example string you put in the question.
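A more compact variant of the same keyword idea, sketched here with re.split and a lookahead so the function name is kept rather than re-attached afterwards. This is only an illustration under assumptions: the keyword list is taken from the question, the helper name split_keep_function is hypothetical, and the pattern will split in the wrong place if a listed function appears after a comma inside another call (for example COALESCE(x, SUM(y))).

```python
import re

# Function names to split on; taken from the question, adjust as needed.
FUNCTIONS = ('SUM', 'MAX', 'MIN', 'MEAN')

# Match a comma (plus trailing whitespace) only when a function name and "("
# follow. The lookahead (?=...) does not consume the name, so it is kept.
SPLIT_RE = re.compile(r',\s*(?=(?:%s)\s*\()' % '|'.join(FUNCTIONS),
                      re.IGNORECASE)


def split_keep_function(sql):
    """Hypothetical helper: split sql before each listed function call."""
    return [part.strip() for part in SPLIT_RE.split(sql)]


sql = ('SUM(case when(B.c in ("A", "B")) then 10 end) as f0,'
       'MAX(case when(B.d = "4321") then A.col10 end) as f1,'
       'SUM(case when(B.c in ("A", "B", "C")) then 2 end) as f2')
print(split_keep_function(sql)[-1])
```

The commas inside in ("A", "B", "C") are left alone because no function name follows them, which is also why this only works while the expressions stay this simple.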































  • But I have already told you that there are no newlines (\n) in the string.

    – Old-School
    Mar 7 at 11:28












  • The code I posted will work whether there are newlines in the string or not. I just make sure at the start that no newlines are present, but that won't affect the outcome if the string does not have any newlines in it.

    – Ralf
    Mar 7 at 11:29











  • Right OK. Thanks for the attempt.

    – Old-School
    Mar 7 at 11:30
































You could use something like:



import re

sql = 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0, MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1, SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'

# find every "as <alias>" occurrence
result = re.finditer(r'as\s+[a-zA-Z0-9]+', sql)

commas = []
parts = []

# record the index of each comma that directly follows an alias
for reg in result:
    end = reg.end()
    if len(sql) > end and sql[end] == ',':
        commas.append(end)

# slice the string at the recorded comma positions
idx = 0
for comma in commas:
    parts.append(sql[idx:comma])
    idx = comma + 1
parts.append(sql[idx:])

print(parts)


The commas list holds the positions of the commas to split on. For the string above it contains:



[151, 322]


The parts list holds the final pieces (I'm not sure this implementation is the best way):



[
'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0',
' MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1',
' SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'
]
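The same alias-based idea can be wrapped in a small function that also strips the leading spaces seen in the output above. This is a sketch under the same assumption as the answer, namely that every expression ends in "as <alias>" directly followed by a comma; the function name split_on_alias_commas is hypothetical, and the \b word boundary is an addition so that words merely ending in "as" are not mistaken for the keyword.

```python
import re

# "as <alias>" followed by a comma; \b keeps words like "has" from matching
ALIAS_COMMA_RE = re.compile(r'\bas\s+\w+\s*,', re.IGNORECASE)


def split_on_alias_commas(sql):
    """Hypothetical helper: split sql at commas that follow an alias."""
    parts = []
    start = 0
    for match in ALIAS_COMMA_RE.finditer(sql):
        # match.end() - 1 is the index of the comma itself
        parts.append(sql[start:match.end() - 1].strip())
        start = match.end()
    parts.append(sql[start:].strip())
    return parts


sql = 'SUM(a) as x, MAX(f(b, c)) as y, SUM(d) as z'
print(split_on_alias_commas(sql)[-1])
```

Expressions without an alias, or aliases used inside nested sub-selects, would break this assumption.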






























  • Is the output of this code even close to the desired output I've posted in my question?

    – Old-School
    Mar 7 at 11:47











  • Edited for you, check now!

    – ALFA
    Mar 7 at 11:58



































3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.



The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.



Re-using the walk_tokens function I defined in the other post I linked to:



from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
queue = deque([token])
while queue:
token = queue.popleft()
if isinstance(token, TokenList):
queue.extend(token)
yield token


extracting the last element from the SELECT identifier list then is:



import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
if isinstance(tok, IdentifierList):
# iterate to leave the last assigned to `identifier`
for identifier in tok.get_identifiers():
pass
break

print(identifier)


Demo:



>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.






share|improve this answer

























  • Wouldn't be much easier with a regular expression?

    – Old-School
    Mar 7 at 11:35











  • @Old-School: no, because Python's regular expressions can't be used to parse nested structures.

    – Martijn Pieters
    Mar 7 at 11:36











  • @Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to used, etc.

    – Martijn Pieters
    Mar 7 at 11:37











  • Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

    – Old-School
    Mar 7 at 11:40











  • @Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

    – Martijn Pieters
    Mar 7 at 11:55















1














You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.



The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.



Re-using the walk_tokens function I defined in the other post I linked to:



from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
queue = deque([token])
while queue:
token = queue.popleft()
if isinstance(token, TokenList):
queue.extend(token)
yield token


extracting the last element from the SELECT identifier list then is:



import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
if isinstance(tok, IdentifierList):
# iterate to leave the last assigned to `identifier`
for identifier in tok.get_identifiers():
pass
break

print(identifier)


Demo:



>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.






share|improve this answer

























  • Wouldn't be much easier with a regular expression?

    – Old-School
    Mar 7 at 11:35











  • @Old-School: no, because Python's regular expressions can't be used to parse nested structures.

    – Martijn Pieters
    Mar 7 at 11:36











  • @Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to used, etc.

    – Martijn Pieters
    Mar 7 at 11:37











  • Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

    – Old-School
    Mar 7 at 11:40











  • @Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

    – Martijn Pieters
    Mar 7 at 11:55













1












1








1







You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.



The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.



Re-using the walk_tokens function I defined in the other post I linked to:



from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
queue = deque([token])
while queue:
token = queue.popleft()
if isinstance(token, TokenList):
queue.extend(token)
yield token


extracting the last element from the SELECT identifier list then is:



import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
if isinstance(tok, IdentifierList):
# iterate to leave the last assigned to `identifier`
for identifier in tok.get_identifiers():
pass
break

print(identifier)


Demo:



>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.






share|improve this answer















You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.



The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.



Re-using the walk_tokens function I defined in the other post I linked to:



from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
    queue = deque([token])
    while queue:
        token = queue.popleft()
        if isinstance(token, TokenList):
            queue.extend(token)
        yield token


extracting the last element from the SELECT identifier list then is:



import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
    if isinstance(tok, IdentifierList):
        # iterate to leave the last assigned to `identifier`
        for identifier in tok.get_identifiers():
            pass
        break

print(identifier)


Demo:



>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
...     if isinstance(tok, IdentifierList):
...         # iterate to leave the last assigned to `identifier`
...         for identifier in tok.get_identifiers():
...             pass
...         break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2


identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.
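
A side note on the traversal order, as a minimal sketch: because walk_tokens uses a queue, it visits tokens breadth-first, so groups nearer the root are yielded before more deeply nested ones. That is presumably why stopping at the first IdentifierList picks up the outer list rather than one nested inside parentheses. The example below does not use sqlparse at all; plain nested lists stand in for token groups, and the names walk and tree are made up for illustration.

```python
from collections import deque


def walk(node):
    """Breadth-first walk; nested lists stand in for sqlparse token groups."""
    queue = deque([node])
    while queue:
        node = queue.popleft()
        if isinstance(node, list):
            # children are queued behind everything already waiting,
            # so shallower items are always yielded first
            queue.extend(node)
        yield node


# Hypothetical tree: a leaf, a nested group, and another leaf.
tree = ["a", ["b", ["c"]], "d"]
leaves = [n for n in walk(tree) if not isinstance(n, list)]
print(leaves)  # ['a', 'd', 'b', 'c']
```

Note that "d" comes out before "b" even though it appears later in the input, because it sits one level closer to the root.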















edited Mar 7 at 11:57

























answered Mar 7 at 11:34









Martijn Pieters












  • Wouldn't it be much easier with a regular expression?

    – Old-School
    Mar 7 at 11:35











  • @Old-School: no, because Python's regular expressions can't be used to parse nested structures.

    – Martijn Pieters
    Mar 7 at 11:36











  • @Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to be used, etc.

    – Martijn Pieters
    Mar 7 at 11:37











  • Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

    – Old-School
    Mar 7 at 11:40











  • @Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

    – Martijn Pieters
    Mar 7 at 11:55
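
To make the point from these comments concrete, here is a minimal sketch (not part of the answer, and not a substitute for a real parser such as sqlparse) of what "handling nesting" means for this input: a comma only separates two select items when it sits outside every pair of parentheses and outside any quoted literal. The function name split_top_level and the shortened sample string are made up for illustration. The sketch ignores SQL comments and other syntax a real parser would handle.

```python
def split_top_level(sql, sep=","):
    """Split on `sep` only where it is outside parentheses and quotes."""
    parts, current = [], []
    depth = 0      # current parenthesis nesting level
    quote = None   # the quote character we are inside, if any
    for ch in sql:
        if quote:
            # inside a quoted literal: only the matching quote ends it
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            # a top-level separator: close the current piece
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


# Hypothetical, shortened input in the same shape as the question's string.
sql = 'SUM(case when(B.c in ("E", "F")) then 1 end) as x, MAX(B.d) as y'
print(split_top_level(sql))
```

A plain sql.split(",") on the same string returns three pieces, because it also cuts at the comma inside ("E", "F"). Keeping a depth counter is exactly the kind of state a regular expression cannot track for arbitrary nesting.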












































I have a solution, though it takes a fair amount of code. It doesn't use regex, just repeated splitting on keywords.



s = """
SUM(case when(A.money-B.money>1000
and A.unixtime-B.unixtime<=890769
and B.col10 = "A"
and B.col11 = "12"
and B.col12 = "V") then 10
end) as finalCond0,
MAX(case when(A.money-B.money<0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "4321"
and B.cond3 in ("E", "F", "G")) then A.col10
end) as finalCond1,
SUM(case when(A.money-B.money>0
and A.unixtime-B.unixtime<=6786000
and B.cond1 = "A"
and B.cond2 = "1234"
and B.cond3 in ("A", "B", "C")) then 2
end) as finalCond2
"""

# remove newlines and double spaces
s = s.replace('\n', ' ')
while '  ' in s:
    s = s.replace('  ', ' ')
s = s.strip()

# split on keywords, starting with the original string
current_parts = [s, ]
for kw in ['SUM', 'MAX', 'MIN']:
    new_parts = []
    for part in current_parts:
        for i, new_part in enumerate(part.split(kw)):
            if i > 0:
                # add keyword to the start of this substring
                new_part = '{}{}'.format(kw, new_part)

            new_part = new_part.strip()
            if len(new_part) > 0:
                new_parts.append(new_part.strip())

    current_parts = new_parts

print()
print('current_parts:')
for s in current_parts:
    print(s)


The output I get is:



current_parts:
SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0,
MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1,
SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2


Does it work for you? It seems to work for the example string you put in the question.
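
For comparison, the same keyword-splitting idea can probably be written more compactly with a zero-width lookahead, which splits in front of each keyword and so keeps the keyword in the piece. This is a sketch rather than the method used above: it assumes Python 3.7 or later (where re.split accepts patterns that match the empty string), uses a made-up shortened input, and strips the trailing commas, which the loop above keeps. It shares the same weakness as the loop: a keyword nested inside another call, such as SUM(MAX(...)), would be split in the wrong place.

```python
import re

# Hypothetical, shortened input in the same shape as the question's string.
s = ('SUM(case when(a>1) then 10 end) as c0, '
     'MAX(case when(b<0) then a end) as c1, '
     'SUM(x) as c2')

# Zero-width lookahead: split *before* each keyword that is directly
# followed by "(", so the keyword itself stays in the resulting piece.
pattern = re.compile(r'(?=\b(?:SUM|MAX|MIN)\()')

# Drop empty pieces, surrounding whitespace and the trailing comma.
parts = [p.strip().rstrip(',') for p in pattern.split(s) if p.strip()]

for p in parts:
    print(p)
```

Requiring the opening parenthesis right after the keyword avoids splitting on column names that merely contain SUM, MAX or MIN.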































  • But I have already told you that there are no newlines (\n) in the string.

    – Old-School
    Mar 7 at 11:28












  • The code I posted will work whether there are newlines in the string or not. I just make sure that no newlines are present at the start, but that won't affect the outcome if the string does not have any newlines in it.

    – Ralf
    Mar 7 at 11:29











  • Right OK. Thanks for the attempt.

    – Old-School
    Mar 7 at 11:30























edited Mar 7 at 11:31

























answered Mar 7 at 11:27









Ralf





































You could use something like:



import re

# `sql` replaces the original name `str`, which shadowed the built-in type
sql = 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0, MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1, SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'

# find every "as <alias>" occurrence
result = re.finditer(r'as\s+[a-zA-Z0-9]+', sql)

commas = []
parts = []

# record the position of each comma that directly follows an alias
for reg in result:
    end = reg.end()
    if len(sql) > end and sql[end] == ',':
        commas.append(end)

# slice the string at the recorded comma positions
idx = 0
for comma in commas:
    parts.append(sql[idx:comma])
    idx = comma + 1
parts.append(sql[idx:])

print(parts)


The commas list will hold the positions of the commas to split on:



[151, 322]


parts will hold the final list of pieces (I'm not sure this implementation is the best way):



[
'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0',
' MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1',
' SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'
]






























  • Is the output of this code even close to the desired output I've posted in my question?

    – Old-School
    Mar 7 at 11:47











  • Edited for you, check now!

    – ALFA
    Mar 7 at 11:58























edited Mar 7 at 11:57

























answered Mar 7 at 11:45









ALFA





















