Note: This article is reproduced from serverfault.com.

How do you state a regex pattern in Haskell?

Posted on 2020-11-28 11:50:34

I'm trying to do a regex replace with the following code:

import Text.RE.Replace
import Text.RE.TDFA.String

onlyLetters :: String -> String
onlyLetters s = replaceAll "" $ s *=~ [re|$([^a-zA-Z])|]

I have found it really hard to find any comprehensible documentation on this. The code above produces the following compile error:

    src\Pangram.hs:6:53: error: parse error on input `]'
  |
6 | onlyLetters s = replaceAll "" $ (s *=~ [re|[a-zA-Z]|])
  |                                                     ^

Progress 1/2

--  While building package pangram-2.0.0.12 (scroll up to its section to see the error) using:
      C:\sr\setup-exe-cache\x86_64-windows\Cabal-simple_Z6RU0evB_3.0.1.0_ghc-8.8.4.exe --builddir=.stack-work\dist\29cc6475 build lib:pangram test:test --ghc-options " -fdiagnostics-color=always"
    Process exited with code: ExitFailure 1
PS C:\Users\mcleg\Exercism\haskell\pangram> stack test
pangram> configure (lib + test)
Configuring pangram-2.0.0.12...
pangram> build (lib + test)
Preprocessing library for pangram-2.0.0.12..
Building library for pangram-2.0.0.12..
[1 of 2] Compiling Pangram

src\Pangram.hs:7:56: error: parse error on input `]'
  |
7 | onlyLetters s = replaceAll "" $ s *=~ [re|$([^a-zA-Z])|]
  |                                                        ^

Progress 1/2

--  While building package pangram-2.0.0.12 (scroll up to its section to see the error) using:
      C:\sr\setup-exe-cache\x86_64-windows\Cabal-simple_Z6RU0evB_3.0.1.0_ghc-8.8.4.exe --builddir=.stack-work\dist\29cc6475 build lib:pangram test:test --ghc-options " -fdiagnostics-color=always"
    Process exited with code: ExitFailure 1

What is the problem with that bracket and how would I do this correctly? Thank you -Skye

Questioner
Skye Sprung
Willem Van Onsem 2020-11-28 19:56:18

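The […|…|] form is quasi-quotation syntax [haskell-wiki]. It is a GHC extension to Haskell's syntax and is not enabled by default. Without it, GHC reads the brackets as ordinary list syntax, which is why the parser fails at the closing ].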

You can turn this on with a LANGUAGE pragma:

{-# LANGUAGE QuasiQuotes #-}

import Text.RE.Replace
import Text.RE.TDFA.String

onlyLetters :: String -> String
onlyLetters s = replaceAll "" $ s *=~ [re|$([^a-zA-Z])|]

A quasiquoter runs at compile time and generates Haskell code that is then spliced into the program. This means the regex can be validated at compile time, and it might even be slightly more efficient than compiling the regex at runtime.
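For contrast, here is a minimal sketch of the runtime-compiled alternative mentioned above. It assumes the regex package's compileRegex function (exported from Text.RE.TDFA.String) and runs it in the Maybe monad. The name onlyLettersRuntime is hypothetical and only used for illustration. No QuasiQuotes pragma is needed here, but a malformed pattern would only be noticed when the program runs.

```haskell
import Text.RE.Replace (replaceAll)
import Text.RE.TDFA.String (compileRegex, (*=~))

-- Hypothetical runtime variant of onlyLetters (not from the original answer).
-- The pattern is an ordinary String, so it is parsed when the function runs
-- rather than when the module is compiled.
onlyLettersRuntime :: String -> Maybe String
onlyLettersRuntime s = do
  -- Assumption: compileRegex signals a bad pattern via 'fail',
  -- which becomes Nothing in the Maybe monad.
  rex <- compileRegex "[^a-zA-Z]"
  -- Replace every non-letter match with the empty string.
  pure (replaceAll "" (s *=~ rex))
```

The Maybe in the result type is the price of the runtime approach: every caller has to handle a pattern error that the quasiquoted version rules out at compile time.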

For the given onlyLetters function, we then get:

*Main> onlyLetters "fo0b4r"
"fobr"