Why does the literal string """"""" (seven quotes) give an error?

When processing user input we often use the strip() method. To remove leading and trailing characters from a specific set, we just pass all of them in the parameter.
The code

".yes' ".strip(". '")

obviously gives the string 'yes' as a result.
When I try to strip the set ' " as well, whether it works depends on the order of these characters. The variant ".yes' ".strip(""" ."'""") works properly, while the variant with the " character at the end gives SyntaxError: unterminated string literal (detected at line 1).
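A minimal sketch of how strip() treats its argument, using only the built-in str.strip(); the variable name raw is just for illustration. The argument is a set of characters rather than a prefix or suffix, so listing them in a different order should not change the result:

```python
raw = ".yes' "

# The argument is read as a set of characters to remove from both ends,
# so the order in which they are listed makes no difference.
print(raw.strip(". '"))
print(raw.strip("' ."))

# Only the two ends are trimmed; matching characters inside are kept.
print("..a.b..".strip("."))
```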

So the question is: why does the string literal """"""" give an error? It should be just the same as '"'!

Update 1. They say I have the wrong mental model of literals. So let's look at the documentation:

Triple quoted: '''Three single quotes''', """Three double quotes"""

So? Do we have to rewrite the documentation or the interpreter?

Oh. Thanks for the downvotes. They do not support what I say about Python when I teach students online. I say: 'The first difference with Python is its large, friendly community' )))

Update 2. In other words, I propose rewriting the LOGIC of the interpreter, because my example starts with """ and ends with """ and has one character inside. It differs from ''' because the pair '' does not have to hold the same symbol ' in between. I am not using """ inside a """ ... """ pair. Do you see the difference?

Update 3. Look at the language reference. So

  1. longstring i.e. """longstringitem"""
  2. longstringitem may be a single char.

So why do you downvote my question??
Please be wiser, or kinder )

>Solution :

This reflects the documented behaviour as per the Python language spec around lexical analysis of strings:

In triple-quoted literals, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the literal. (A "quote" is the character used to open the literal, i.e. either ' or ".)

The crucial point here is that "three unescaped quotes in a row terminate the literal". So if you begin a literal with """, that literal ends as soon as another """ sequence is encountered: the parser doesn’t look ahead of that to try to infer a different endpoint for the literal.
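A small sketch of that rule, assuming ast.literal_eval from the standard library is an acceptable stand-in for how the interpreter reads a literal. The names quote_first and quote_last are made up for this illustration. An extra quote right after the opening delimiter becomes part of the body, while an extra quote right before the closing delimiter ends up after the literal:

```python
import ast

# Source text of two literals, held in ordinary strings so that this
# example itself stays valid Python.
quote_first = '""""a"""'  # """ opens, body is "a, """ closes
quote_last = '"""a""""'   # """ opens, body is a, """ closes, one " is left over

# The extra quote at the start is simply the first character of the body.
print(repr(ast.literal_eval(quote_first)))

# The extra quote at the end starts a new literal that is never closed.
try:
    ast.literal_eval(quote_last)
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)
```

The exact wording of the error message varies between Python versions, so the sketch prints it instead of assuming it.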

When the parser encounters """"""" (a run of seven double-quotes), therefore:

  1. The 1st, 2nd and 3rd characters tell the parser it’s dealing with a literal delimited by triple double-quotes.
  2. The 4th, 5th and 6th characters constitute those "three unescaped quotes" so they terminate the literal.
  3. The 7th character is then a lone " that opens a new, ordinary string literal. There is no following " that it can be paired with before the end of the line, so that 7th character constitutes an unterminated literal. The parser fails with SyntaxError: unterminated string literal.
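The steps above can be checked with a short sketch. It assumes ast.literal_eval is a fair proxy for the interpreter's handling of literals; the helper parses() and the list alternatives are hypothetical names introduced here. It also shows a few spellings of the one-character string that do parse, and a strip() call that places the double quote last by escaping it:

```python
import ast

def parses(source: str) -> bool:
    """Return True if source is accepted as a Python literal."""
    try:
        ast.literal_eval(source)
    except SyntaxError:
        return False
    return True

# The literal from the question: a run of seven double quotes.
seven_quotes = '"' * 7
print(parses(seven_quotes))

# Source text of some spellings of the one-character string that do parse.
alternatives = [
    "'" + '"' + "'",  # single-quoted literal holding one double quote
    r'"\""',          # double-quoted literal with the quote escaped
    r'"""\""""',      # triple-quoted literal with the quote escaped
]
for source in alternatives:
    print(source, "->", repr(ast.literal_eval(source)))

# With the quote escaped, it can sit at the end of the strip() argument.
print(".yes' ".strip(" .'\""))
```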
