Text normalization made easy - for real

Cucco is here to help you normalize that nasty text. Let this little friend do the hard work for you.

Multiple transformations

Removing extra white spaces, replacing symbols, emojis, etc. There's a lot you can do with your text.

Easy to use

Do more with less code. Cucco offers the means to simplify your work.

50+ languages

Removing English stop words is nice, but removing them in more than 50 languages is better, right?

Use it from your terminal

Run Cucco directly from your code or from the command line. Either way, you can normalize your text in a couple of steps.

Fully tested

Worried about broken updates? With 100% test coverage, we have your back. Find us on Codecov.

Public API

Want to try Cucco without the bother of installing it? Use our API.

Getting started

Use cucco to normalize your text. It's pretty easy. All it takes is a few lines of code.

Add some normalizations

Or you can simply choose not to add them and use the default normalizations. Your pick.

from cucco import Cucco

cucco = Cucco()

normalizations = [
    'remove_extra_whitespaces',
    ('replace_punctuation', {'replacement': ' '})
]

print(cucco.normalize('Who let       the   cucco out?',
                      normalizations))

Output:

Who let the cucco out
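
To make the shape of the normalizations list concrete, here is a minimal plain-Python sketch of how such a list might be applied in order. It is not Cucco's implementation: the `REGISTRY` dictionary, the standalone `normalize` function, the two helper functions, and the final trimming of the result are all assumptions made for illustration. Only the normalization names and the `(name, options)` tuple form come from the example above.

```python
import re
import string


def remove_extra_whitespaces(text):
    # Collapse every run of whitespace into one space and trim both ends.
    return ' '.join(text.split())


def replace_punctuation(text, replacement=''):
    # Swap each ASCII punctuation character for the replacement string.
    pattern = '[' + re.escape(string.punctuation) + ']'
    return re.sub(pattern, replacement, text)


# Hypothetical lookup table from normalization name to function.
REGISTRY = {
    'remove_extra_whitespaces': remove_extra_whitespaces,
    'replace_punctuation': replace_punctuation,
}


def normalize(text, normalizations):
    # Each entry is either a bare name or a (name, options) tuple.
    for entry in normalizations:
        if isinstance(entry, str):
            name, options = entry, {}
        else:
            name, options = entry
        text = REGISTRY[name](text, **options)
    # Assumption: the result is trimmed so a trailing replacement space is dropped.
    return text.strip()


normalizations = [
    'remove_extra_whitespaces',
    ('replace_punctuation', {'replacement': ' '}),
]

result = normalize('Who let       the   cucco out?', normalizations)
print(result)
```

The steps run in the order given, so reordering the list can change the result. For instance, replacing punctuation with spaces before collapsing whitespace would leave no doubled spaces behind.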