Need help figuring out short regex – how to match a char, then either of a set including the char, then another char

My regex does not match phrases as intended, and I don’t know whether what I’m trying to do is possible. Intended match (as the string progresses): the phrase starts with ‘t’; the first character after the beginning ‘t’ must not be ‘t’; it then has any number of ‘t’ or ‘y’ characters (can be 0); must end…

Regex to split a column in R after the second pipe and after the second T

I have a one-column data frame of thousands of lines, all built on the same pattern, for example:
ids <- c("ETC|HMPI01000001|HMPI01000001.1 TAG: Genus Species, T05X3Ml2_CL10007Cordes1_1",
         "ETC|HMPI31000002|HMPI31000002.1 TAG: Genus Species, T3X3Ml2_CL10157Cordes1_1",
         "ETC|HMPI01000007|HMPI01000007.1 TAG: Genus Species, T1X3Ml2_CL11231Cordes1_1")
df <- as.data.frame(ids)
> df
  ids
1 ETC|HMPI01000001|HMPI01000001.1 TAG: Genus Species, T05X3Ml2_CL10007Cordes1_1
2 ETC|HMPI31000002|HMPI31000002.1 TAG: Genus Species, T3X3Ml2_CL10157Cordes1_1
3 ETC|HMPI01000007|HMPI01000007.1…

Regex for finding string after the second occurrence of the character

The problem is to get from the string ‘https://myapp-ui.private.dev.mysubdom.eu’ the substring ‘dev.mysubdom.eu’, without the ‘private’ part. In other words, I want to get the substring after the second occurrence of the dot character ‘.’. What I tried, and what works (returning the string after the first dot): to…
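The "everything after the second dot" requirement can be sketched in Python as follows. This is only an illustration under the assumption that the input always looks like the sample URL; the helper name `after_second_dot` is hypothetical and is not taken from the original question or its answers.

```python
import re

# Skip two "anything up to and including a dot" chunks, then capture the rest.
AFTER_SECOND_DOT = re.compile(r"^(?:[^.]*\.){2}(.*)$")


def after_second_dot(text):
    """Return the part of `text` after its second '.', or None if it has fewer than two dots."""
    match = AFTER_SECOND_DOT.match(text)
    return match.group(1) if match else None


print(after_second_dot("https://myapp-ui.private.dev.mysubdom.eu"))  # dev.mysubdom.eu
```

The negated class `[^.]*` keeps each chunk from running past a dot, so the capture always starts right after the second one.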

How to match fixed length string with quantifiers

I have strings like this: 123456-0001, 123456-0012, 123456-0123. How can I match them under the following conditions: the character count after the ‘-’ should be 4; the number of zeros is variable, from 1 to 3. I found the pattern ^\d{6}-0+([1-9]+)$, but it also matches 123456-001 and 123456-00001. Solution: You can use ^\d{6}-(?=\d{4}$)0+([1-9]\d*)$ See the regex demo. Details: ^ – start…
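To see how the lookahead in the quoted solution enforces the fixed length, here is a small Python check of that same pattern against the strings mentioned in the question. The helper name `significant_digits` is hypothetical and used only for illustration.

```python
import re

# Pattern quoted in the solution: the lookahead (?=\d{4}$) requires exactly
# four digits after the hyphen before 0+ and [1-9]\d* divide them up.
PATTERN = re.compile(r"^\d{6}-(?=\d{4}$)0+([1-9]\d*)$")


def significant_digits(value):
    """Return the digits after the leading zeros, or None if `value` does not match."""
    match = PATTERN.match(value)
    return match.group(1) if match else None


for value in ["123456-0001", "123456-0012", "123456-0123", "123456-001", "123456-00001"]:
    print(value, significant_digits(value))
```

The first three strings match and yield `1`, `12` and `123`; the last two are rejected because the part after the hyphen is not exactly four digits long.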

Extracting maximum number from DataFrame of strings (and some NaN values)

Look at the DataFrame:
import pandas as pd
import numpy as np
data=pd.DataFrame(['random 15 numbers 128 and 12 letters','12-5','page 65'],columns=['text'])
I want to extract all numbers from the strings and write the maximum number into a new column. I achieved that with this code:
data['list']=data['text'].str.extractall('(\d+)').unstack().values.tolist()
data['max']=data['list'].apply(lambda row:max([int(x) for x in row if x is…
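The underlying idea (find every run of digits, then take the maximum) can be sketched without pandas. This is a plain-Python illustration, not the code from the question or its answer; `max_number` is a hypothetical helper, and returning `None` for missing values or strings without digits is an assumption about the desired behaviour.

```python
import re


def max_number(text):
    """Return the largest integer found in `text`, or None for non-strings (e.g. NaN) or no digits."""
    if not isinstance(text, str):
        return None
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    return max(numbers) if numbers else None


rows = ["random 15 numbers 128 and 12 letters", "12-5", "page 65", float("nan")]
print([max_number(row) for row in rows])  # [128, 12, 65, None]
```

With the DataFrame from the question, a function like this could presumably be applied per row, for example via `data['text'].apply(max_number)`.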

How do I parse a file name format like 'PUBLIC001' with a Regular Expression?

I need help with a regular expression that parses a file name. The file will be named PUBLIC001. ‘PUBLIC’ is static text in all file names. The last 3 digits are the day of the year: 001 (Jan 1) to 366 (Dec 31 in a leap year) is the valid range. What would the regular expression be? Is there a way to limit the max to 366?…
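One way to cap the numeric part at 366 is to spell the range out as alternatives. The Python sketch below is an assumption about how this could be done, not the answer from the original thread; `day_of_year` is a hypothetical helper name.

```python
import re

# 001-009 | 010-099 | 100-299 | 300-359 | 360-366
DAY_FILE = re.compile(r"PUBLIC(00[1-9]|0[1-9][0-9]|[12][0-9]{2}|3[0-5][0-9]|36[0-6])")


def day_of_year(name):
    """Return the day of the year encoded in `name`, or None if the name is not valid."""
    match = DAY_FILE.fullmatch(name)
    return int(match.group(1)) if match else None


for name in ["PUBLIC001", "PUBLIC099", "PUBLIC366", "PUBLIC000", "PUBLIC367"]:
    print(name, day_of_year(name))
```

An alternative design is to match any three digits with `PUBLIC(\d{3})` and check `1 <= day <= 366` in code, which is usually easier to read than encoding the range in the pattern.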

Replace little text format hashtag in string using javascript

I have a string returned by the LinkedIn API that contains a number of hashtags. They are formatted like this: {hashtag|\#|somehashtag} I am trying to use a regex with String.replaceAll that will replace all occurrences of these hashtags with a standard hashtag notation like this: #somehashtag I think this regex will identify the hashtags…
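The replacement described above can be sketched in Python (the question itself uses JavaScript's String.replaceAll; the pattern idea carries over). It assumes the backslash before `#` is literally present in the API string, as in the sample; `normalize_hashtags` and the sample sentence are hypothetical.

```python
import re

# Match {hashtag|\#|NAME} and capture NAME; \\ in the pattern matches the literal backslash.
HASHTAG = re.compile(r"\{hashtag\|\\#\|([^|}]+)\}")


def normalize_hashtags(text):
    """Rewrite every {hashtag|\\#|NAME} marker in `text` as #NAME."""
    return HASHTAG.sub(r"#\1", text)


sample = r"Great talk {hashtag|\#|somehashtag} and {hashtag|\#|regex}!"
print(normalize_hashtags(sample))  # Great talk #somehashtag and #regex!
```

Capturing with `[^|}]+` rather than `.*` keeps each match inside a single pair of braces when several hashtags appear in one string.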