Edit functions to capture numbers and fractions from a text file #4
ab-majeed wants to merge 3 commits into
| # use regex to match numbers and fractions | ||
| import re | ||
| frac_pattern = r'(-?\d+\\[A-Za-z]+{\d+}{\d+} |-?\\[A-Za-z]+{\d+}{\d+})' | ||
| number_pattern = r"([^a-zA-Z{./\\}]\d+ |\d+\.\d+|[:]\d+|\d+[:])" |
This pattern will also capture numbers next to a colon, e.g. `23:`, which is not a number.
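To make the point concrete, here is a minimal sketch that runs the `number_pattern` from the diff on a made-up sentence. The sample text and the stricter `whole_number` pattern are assumptions for illustration only; the README decides what actually counts as a valid number.

```python
import re

# Pattern copied from the diff under review.
number_pattern = r"([^a-zA-Z{./\\}]\d+ |\d+\.\d+|[:]\d+|\d+[:])"

sample = "meeting at 23: sharp"  # made-up sample text
loose = [m.group(0) for m in re.finditer(number_pattern, sample)]
print(loose)  # the colon is captured together with the digits

# Hypothetical stricter check: the whole token must be a number.
whole_number = re.compile(r"-?\d+(?:\.\d+)?")
strict = [tok for tok in sample.split() if whole_number.fullmatch(tok)]
print(strict)  # "23:" is rejected because of the trailing colon
```

The last alternative, `\d+[:]`, is the one that lets `23:` through.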
| # in article.txt | ||
| # use regex to match numbers and fractions | ||
| import re | ||
| frac_pattern = r'(-?\d+\\[A-Za-z]+{\d+}{\d+} |-?\\[A-Za-z]+{\d+}{\d+})' |
This pattern will capture any text that starts with `\`. Try specifying the literal text `frac` or `tinyfrac`.
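A small sketch of the difference, assuming a made-up sample string. `strict_pattern` is only one possible way to name the commands explicitly, not the required solution.

```python
import re

# Pattern copied from the diff under review: any \command{..}{..} matches.
frac_pattern = r'(-?\d+\\[A-Za-z]+{\d+}{\d+} |-?\\[A-Za-z]+{\d+}{\d+})'

# Hypothetical stricter pattern: only \frac and \tinyfrac are accepted,
# with an optional whole-number part in front.
strict_pattern = r"-?\d*\\(?:frac|tinyfrac)\{\d+\}\{\d+\}"

sample = r"\sqrt{2}{3} \frac{1}{2} 3\tinyfrac{1}{4}"  # made-up sample text

loose = [m.group(0) for m in re.finditer(frac_pattern, sample)]
strict = [m.group(0) for m in re.finditer(strict_pattern, sample)]

print(loose)   # includes the unrelated \sqrt command
print(strict)  # only the two fractions
```

Naming the commands also lets the mixed fraction `3\tinyfrac{1}{4}` be captured as a single item.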
| """ | ||
| pass #TODO update the fucntion to pass test | ||
| count = -1 |
Mrsatatima left a comment
Although it passes all the test cases, it is capturing invalid numbers and skipping valid ones. Try converting the article file into a list for more efficiency.
| count = -1 | ||
| with open(text, "r") as f: | ||
| article = f.read() |
Try using `split()` to convert the string to a list for better capture, because a regex run over the whole text has no stop or continue between words. If you split on spaces you get a list of strings; you can match each string against the pattern and use `$` to make sure the number is not followed by an invalid character. Check the README again.
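The suggestion could look roughly like the sketch below. `NUMBER_TOKEN`, `capture_numbers`, and the sample text are hypothetical names and data; the exact definition of a valid number should come from the README.

```python
import re

# Hypothetical token pattern: optional sign, digits, optional decimal part.
# ^ and $ anchor the match so nothing may come before or after the number.
NUMBER_TOKEN = re.compile(r"^-?\d+(?:\.\d+)?$")

def capture_numbers(text):
    """Return the whitespace-separated tokens that are entirely a number."""
    return [tok for tok in text.split() if NUMBER_TOKEN.match(tok)]

sample = "Room 12 holds 3.5 tons, see 23: and x2\nnext 7"  # made-up text
numbers = capture_numbers(sample)
print(numbers)
```

Because `split()` with no argument splits on any whitespace, newlines never reach the pattern, and tokens such as `23:` or `x2` fail the anchored match.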
| print(x.group(0)) | ||
| count += 1 | ||
| return count | ||
| numbers.append(x.group(0)) |
Print this list and you can see it is capturing newlines, which is not correct. Update your pattern.
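The sketch below isolates the first alternative of the number pattern to show where the newline comes from. The sample text is made up, and only this fragment of the pattern is exercised.

```python
import re

# First alternative of number_pattern from the diff: the negated class
# accepts any character that is not listed, and "\n" is not listed.
with_newline = re.compile(r"[^a-zA-Z{./\\}]\d+ ")

# Same fragment with "\n" added to the excluded characters.
without_newline = re.compile(r"[^\na-zA-Z{./\\}]\d+ ")

sample = "end\n42 next"  # made-up sample text

first = with_newline.search(sample).group(0)
second = without_newline.search(sample).group(0)

print(repr(first))   # the match starts with the newline
print(repr(second))  # the newline is no longer part of the match
```

Printing with `repr()` makes the stray `\n` visible; a plain `print()` only shows an extra blank line.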
Mrsatatima left a comment
Your fraction pattern is effective, but capturing numbers is still the issue.
| import re | ||
| frac_pattern = r'(-?\d+\\tinyfrac{\d+}{\d+}[^A-Za-z]|-?\\frac{\d+}{\d+}[^A-Za-z])' | ||
| number_pattern = r'([^a-zA-Z{./\\}]\d+ |\d+\.\d+| \d+[^/:\\)-}])' | ||
| number_pattern = r"([^\na-zA-Z{./\\}]\d+ |\d+\.\d+|\d+[^\n/:\\)-}])" |
Much better pattern, but still lacking. Anyway, good job.
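One thing that may contribute to numbers being skipped (an observation about the regex syntax, not something stated in the review): inside a character class, `)-}` is read as a range from `)` to `}`, which covers digits, letters, `,` and `.`. The sketch below checks the last alternative of the revised pattern on made-up tokens; `escaped` is a hypothetical variant with the hyphen escaped.

```python
import re

# Last alternative copied from the revised number_pattern in the diff.
tail = re.compile(r"\d+[^\n/:\\)-}]")

# Hypothetical variant: the hyphen is escaped, so ")" "-" "}" are literals.
escaped = re.compile(r"\d+[^\n/:\\)\-}]")

# ")-}" spans code points 41..125, so "," (44) is excluded by the range.
comma_original = bool(tail.fullmatch("12,"))
space_original = bool(tail.fullmatch("12 "))
comma_escaped = bool(escaped.fullmatch("12,"))

print(comma_original, space_original, comma_escaped)
```

If a number followed by a comma should count, the unescaped range silently rejects it.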
Mrsatatima left a comment
Great effort, but pytest is failing. As I recommended last time, instead of passing the whole string to `finditer`, convert the large string to a list (array) of strings, then loop through it and match the pattern using `re.search`, `re.match` or `re.findall`.
I will be posting a better way to do it soon. Good effort. I will be closing the pull request and have already graded you, but if you want to get the full score, make the changes stated above and open a new pull request. Congrats.
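A minimal sketch of the requested structure, reading the file, splitting it into a list, and testing each item. It uses `fullmatch` rather than `re.search` or `re.match` so that the whole token has to fit the pattern. `count_matches`, both patterns, and the sample file content are assumptions for illustration, not the reviewer's reference solution.

```python
import os
import re
import tempfile

# Hypothetical whole-token patterns; adjust them to the rules in the README.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")
FRACTION = re.compile(r"-?\d*\\(?:frac|tinyfrac)\{\d+\}\{\d+\}")

def count_matches(path):
    """Count whitespace-separated tokens that are a number or a fraction."""
    with open(path, "r") as f:
        tokens = f.read().split()  # large string -> list of strings
    count = 0
    for token in tokens:
        if NUMBER.fullmatch(token) or FRACTION.fullmatch(token):
            count += 1
    return count

# Made-up article text written to a temporary file for the demonstration.
text = r"Add 2 cups, 1\tinyfrac{1}{2} tsp and \frac{3}{4} of 10.5 at 23:" + "\nthen 7"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(text)
    path = tmp.name

result = count_matches(path)
os.remove(path)
print(result)
```

Here `2`, `1\tinyfrac{1}{2}`, `\frac{3}{4}`, `10.5` and `7` are counted, while `cups,` and `23:` are not.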
The functions worked successfully