Smart Encoding

Smart Encoding is an optional feature that automatically transliterates Unicode characters to their GSM-7 equivalents, potentially reducing the number of message segments required to send your SMS.

SMS messages can be encoded in two ways:

  • GSM-7: Up to 160 characters in a single-segment message, or 153 characters per segment in a concatenated message;

  • UCS-2 (Unicode): Required for characters outside the GSM-7 alphabet, but limited to 70 characters in a single-segment message, or 67 characters per segment in a concatenated message;

When your message contains certain Unicode characters (such as "smart quotes" from word processors), the entire message must be sent as UCS-2, which significantly reduces the character limit. Smart Encoding attempts to replace these characters with GSM-7 compatible alternatives.
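To make the segment arithmetic concrete, here is a minimal Python sketch that estimates the encoding and segment count for a message body. The function name count_segments is hypothetical and is not part of the API. The character set follows the standard GSM 03.38 default alphabet, in which extension characters such as ^ { } [ ] ~ | \ and € each occupy two positions. The sketch ignores the edge case where an extension character would straddle a segment boundary.

```python
# GSM 03.38 default alphabet (printable characters plus LF and CR).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these costs two positions (escape + character).
GSM7_EXTENDED = "^{}\\[~]|€\f"


def count_segments(text: str) -> tuple[str, int]:
    """Return (encoding, estimated segment count) for a message body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi, encoding = 160, 153, "GSM-7"
    else:
        # UCS-2 limits are counted in 16-bit units; emoji take two.
        units = len(text.encode("utf-16-be")) // 2
        single, multi, encoding = 70, 67, "UCS-2"
    if units <= single:
        return encoding, 1
    return encoding, -(-units // multi)  # ceiling division


base = "x" * 99
print(count_segments(base + "'"))       # straight apostrophe
print(count_segments(base + "\u2019"))  # right single quotation mark
```

With these inputs, the 100-character message ending in a straight apostrophe fits in one GSM-7 segment, while the same message ending in a curly apostrophe needs two UCS-2 segments.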

How to Enable

Add meta.sms.smartEncoding: true to your SMS message options:

POST /workspaces/{workspaceId}/channels/{channelId}/messages
{
  "receiver": {
    "contacts": [{ "identifierValue": "+1234567890" }]
  },
  "body": {
    "type": "text",
    "text": { "text": "“Hello”! Can’t wait for our meeting!" }
  },
  "meta": {
    "sms": {
      "smartEncoding": true
    }
  }
}
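The same request body can be assembled in code. The sketch below only builds and serializes the JSON payload shown above; the helper name build_sms_payload is hypothetical, and the base URL and authentication headers are left out because this section does not specify them.

```python
import json


def build_sms_payload(number: str, text: str, smart_encoding: bool = True) -> dict:
    """Build the message body; add meta.sms.smartEncoding only when requested."""
    payload = {
        "receiver": {"contacts": [{"identifierValue": number}]},
        "body": {"type": "text", "text": {"text": text}},
    }
    if smart_encoding:
        payload["meta"] = {"sms": {"smartEncoding": True}}
    return payload


payload = build_sms_payload("+1234567890", "“Hello”! Can’t wait for our meeting!")
# ensure_ascii=False keeps the curly quotes readable in the serialized body.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```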

In Navigators, this feature works in exactly the same way.

How It Works

When smartEncoding is enabled:

  1. Transliteration: The system replaces known Unicode characters with GSM-7 equivalents (e.g., “ and ” become ", and … becomes ...)

  2. Compatibility Check: After transliteration, the system checks if the result is fully GSM-7 compatible

  3. Outcome:

    1. Success: If all characters are now GSM-7 compatible, the transliterated text is used for segment calculation and transmission. The response sets meta.extraInformation.smartEncoded to "success".

    2. Failed: If non-GSM-7 characters remain (e.g., emoji, CJK characters), the original message is sent as Unicode. The response sets meta.extraInformation.smartEncoded to "failed".
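The three steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service's implementation: smart_encode and SAMPLE_MAP are hypothetical names, and the map holds only a handful of rows from the Character Mapping Reference rather than the full list.

```python
# GSM 03.38 default alphabet plus its extension table.
GSM7 = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€\f"
)

# A few sample rows from the mapping table, not the complete list.
SAMPLE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',  # curly double quotes
    "\u2018": "'", "\u2019": "'",  # curly single quotes
    "\u2014": "-", "\u2013": "-",  # em dash, en dash
    "\u2026": "...",               # horizontal ellipsis
    "\u00a0": " ",                 # no-break space
    "\u200b": None,                # zero width space is removed
})


def smart_encode(text: str) -> tuple[str, str]:
    """Return (status, text to send)."""
    candidate = text.translate(SAMPLE_MAP)   # step 1: transliteration
    if all(ch in GSM7 for ch in candidate):  # step 2: compatibility check
        return "success", candidate          # step 3.1: send transliterated text
    return "failed", text                    # step 3.2: send the original as Unicode


print(smart_encode("\u201cHello\u201d \u2014 welcome to our service\u2026"))
print(smart_encode("See you soon \U0001F600"))
```

The first call returns "success" with the straight-quoted text, and the second returns "failed" with the original text, because the emoji has no GSM-7 equivalent.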

Example

Input message: “Hello” — welcome to our service…

After transliteration: "Hello" - welcome to our service...

This message now fits in GSM-7 encoding (160-character limit) instead of requiring Unicode (70-character limit).

Response

When smart encoding is enabled, the message response includes a smartEncoded flag in meta.extraInformation:

Possible values of meta.extraInformation.smartEncoded in the message object:

  • success: Transliteration succeeded; message sent as GSM-7;

  • failed: Transliteration was attempted, but the message still contains non-GSM-7 characters; sent as Unicode (UCS-2);
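A client can read the flag defensively, since it is only described as present when smart encoding is enabled. The sketch below assumes the response has already been parsed into a Python dict; the function names are hypothetical, and only the meta.extraInformation.smartEncoded path comes from this page.

```python
from typing import Optional


def smart_encoding_status(message: dict) -> Optional[str]:
    """Return "success", "failed", or None when the flag is absent."""
    meta = message.get("meta") or {}
    extra = meta.get("extraInformation") or {}
    return extra.get("smartEncoded")


def describe(message: dict) -> str:
    status = smart_encoding_status(message)
    if status == "success":
        return "sent as GSM-7"
    if status == "failed":
        return "sent as Unicode (UCS-2)"
    return "smart encoding not reported"


print(describe({"meta": {"extraInformation": {"smartEncoded": "failed"}}}))
```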

Character Mapping Reference

Character  Unicode        Replacement  Description
«          U+00AB         "            Left-pointing double angle quotation mark
»          U+00BB         "            Right-pointing double angle quotation mark
“          U+201C         "            Left double quotation mark
”          U+201D         "            Right double quotation mark
ʺ          U+02BA         "            Modifier letter double prime
ˮ          U+02EE         "            Modifier letter double apostrophe
           U+201F         "            Double high-reversed-9 quotation mark
           U+275D         "            Heavy double turned comma quotation mark ornament
           U+275E         "            Heavy double comma quotation mark ornament
           U+301D         "            Reversed double prime quotation mark
           U+301E         "            Double prime quotation mark
           U+FF02         "            Fullwidth quotation mark
           U+201E         "            Double low-9 quotation mark
‘          U+2018         '            Left single quotation mark
’          U+2019         '            Right single quotation mark
ʻ          U+02BB         '            Modifier letter turned comma
ˈ          U+02C8         '            Modifier letter vertical line
ʼ          U+02BC         '            Modifier letter apostrophe
ʽ          U+02BD         '            Modifier letter reversed comma
ʹ          U+02B9         '            Modifier letter prime
           U+201B         '            Single high-reversed-9 quotation mark
           U+FF07         '            Fullwidth apostrophe
´          U+00B4         '            Acute accent
ˊ          U+02CA         '            Modifier letter acute accent
`          U+0060         '            Grave accent
ˋ          U+02CB         '            Modifier letter grave accent
           U+275B         '            Heavy single turned comma quotation mark ornament
           U+275C         '            Heavy single comma quotation mark ornament
̓           U+0313         '            Combining comma above
̔           U+0314         '            Combining reversed comma above
           U+FE10         '            Presentation form for vertical comma
           U+FE11         '            Presentation form for vertical ideographic comma
÷          U+00F7         /            Division sign
¼          U+00BC         1/4          Vulgar fraction one quarter
½          U+00BD         1/2          Vulgar fraction one half
¾          U+00BE         3/4          Vulgar fraction three quarters
           U+29F8         /            Big solidus
̷           U+0337         /            Combining short solidus overlay
̸           U+0338         /            Combining long solidus overlay
           U+2044         /            Fraction slash
           U+2215         /            Division slash
           U+FF0F         /            Fullwidth solidus
           U+29F9         \            Big reverse solidus
           U+29F5         \            Reverse solidus operator
           U+20E5         \            Combining reverse solidus overlay
           U+FE68         \            Small reverse solidus
           U+FF3C         \            Fullwidth reverse solidus
̲           U+0332         _            Combining low line
＿         U+FF3F         _            Fullwidth low line
           U+2017         _            Double low line
           U+20D2         |            Combining long vertical line overlay
           U+20D3         |            Combining short vertical line overlay
           U+2223         |            Divides
           U+FF5C         |            Fullwidth vertical line
           U+23B8         |            Left vertical box line
           U+23B9         |            Right vertical box line
           U+23D0         |            Vertical line extension
           U+239C         |            Left parenthesis extension
           U+239F         |            Right parenthesis extension
           U+23BC         -            Horizontal scan line-7
           U+23BD         -            Horizontal scan line-9
           U+2015         -            Horizontal bar
           U+FE63         -            Small hyphen-minus
           U+FF0D         -            Fullwidth hyphen-minus
           U+2010         -            Hyphen
           U+2022         -            Bullet
           U+2043         -            Hyphen bullet
           U+2014         -            Em dash
           U+2013         -            En dash
           U+FE6B         @            Small commercial at
           U+FF20         @            Fullwidth commercial at
           U+FE69         $            Small dollar sign
           U+FF04         $            Fullwidth dollar sign
ǃ          U+01C3         !            Latin letter retroflex click
           U+FE15         !            Presentation form for vertical exclamation mark
           U+FE57         !            Small exclamation mark
           U+FF01         !            Fullwidth exclamation mark
           U+FE5F         #            Small number sign
           U+FF03         #            Fullwidth number sign
           U+FE6A         %            Small percent sign
           U+FF05         %            Fullwidth percent sign
           U+FE60         &            Small ampersand
           U+FF06         &            Fullwidth ampersand
           U+201A         ,            Single low-9 quotation mark
̦           U+0326         ,            Combining comma below
           U+FE50         ,            Small comma
           U+3001         ,            Ideographic comma
           U+FE51         ,            Small ideographic comma
           U+FF0C         ,            Fullwidth comma
           U+FF64         ,            Halfwidth ideographic comma
           U+2768         (            Medium left parenthesis ornament
           U+276A         (            Medium flattened left parenthesis ornament
           U+FE59         (            Small left parenthesis
           U+FF08         (            Fullwidth left parenthesis
           U+27EE         (            Mathematical left flattened parenthesis
           U+2985         (            Left white parenthesis
           U+2769         )            Medium right parenthesis ornament
           U+276B         )            Medium flattened right parenthesis ornament
           U+FE5A         )            Small right parenthesis
           U+FF09         )            Fullwidth right parenthesis
           U+27EF         )            Mathematical right flattened parenthesis
           U+2986         )            Right white parenthesis
           U+204E         *            Low asterisk
           U+2217         *            Asterisk operator
           U+229B         *            Circled asterisk operator
           U+2722         *            Four teardrop-spoked asterisk
           U+2723         *            Four balloon-spoked asterisk
           U+2724         *            Heavy four balloon-spoked asterisk
           U+2725         *            Four club-spoked asterisk
           U+2731         *            Heavy asterisk
           U+2732         *            Open centre asterisk
           U+2733         *            Eight spoked asterisk
           U+273A         *            Sixteen pointed asterisk
           U+273B         *            Teardrop-spoked asterisk
           U+273C         *            Open centre teardrop-spoked asterisk
           U+273D         *            Heavy teardrop-spoked asterisk
           U+2743         *            Heavy teardrop-spoked pinwheel asterisk
           U+2749         *            Balloon-spoked asterisk
           U+274A         *            Eight teardrop-spoked propeller asterisk
           U+274B         *            Heavy eight teardrop-spoked propeller asterisk
           U+29C6         *            Squared asterisk
           U+FE61         *            Small asterisk
           U+FF0A         *            Fullwidth asterisk
˖          U+02D6         +            Modifier letter plus sign
           U+FE62         +            Small plus sign
           U+FF0B         +            Fullwidth plus sign
           U+3002         .            Ideographic full stop
           U+FE52         .            Small full stop
           U+FF0E         .            Fullwidth full stop
           U+FF61         .            Halfwidth ideographic full stop
-          U+FF10-U+FF19  0-9          Fullwidth digits
ː          U+02D0         :            Modifier letter triangular colon
˸          U+02F8         :            Modifier letter raised colon
           U+2982         :            Z notation type colon
           U+A789         :            Modifier letter colon
           U+FE13         :            Presentation form for vertical colon
           U+FF1A         :            Fullwidth colon
           U+204F         ;            Reversed semicolon
           U+FE14         ;            Presentation form for vertical semicolon
           U+FE54         ;            Small semicolon
           U+FF1B         ;            Fullwidth semicolon
           U+FE64         <            Small less-than sign
           U+FF1C         <            Fullwidth less-than sign
           U+2039         <            Single left-pointing angle quotation mark
͇           U+0347         =            Combining equals sign below
           U+A78A         =            Modifier letter short equals sign
           U+FE66         =            Small equals sign
           U+FF1D         =            Fullwidth equals sign
           U+FE65         >            Small greater-than sign
           U+FF1E         >            Fullwidth greater-than sign
           U+203A         >            Single right-pointing angle quotation mark
           U+FE16         ?            Presentation form for vertical question mark
           U+FE56         ?            Small question mark
           U+FF1F         ?            Fullwidth question mark
-          U+FF21-U+FF3A  A-Z          Fullwidth Latin capital letters
-          U+1D00-U+1D22  A-Z          Latin letters small capital
ˆ          U+02C6         ^            Modifier letter circumflex accent
̂           U+0302         ^            Combining circumflex accent
           U+FF3E         ^            Fullwidth circumflex accent
           U+1DCD         ^            Combining double circumflex above
           U+2774         {            Medium left curly bracket ornament
           U+FE5B         {            Small left curly bracket
           U+FF5B         {            Fullwidth left curly bracket
           U+2775         }            Medium right curly bracket ornament
           U+FE5C         }            Small right curly bracket
           U+FF5D         }            Fullwidth right curly bracket
           U+FF3B         [            Fullwidth left square bracket
           U+FF3D         ]            Fullwidth right square bracket
˜          U+02DC         ~            Small tilde
˷          U+02F7         ~            Modifier letter low tilde
̃           U+0303         ~            Combining tilde
̰           U+0330         ~            Combining tilde below
̴           U+0334         ~            Combining tilde overlay
           U+223C         ~            Tilde operator
           U+FF5E         ~            Fullwidth tilde
           U+203C         !!           Double exclamation mark
           U+2026         ...          Horizontal ellipsis
           U+00A0         (space)      No-break space
           U+2000         (space)      En quad
           U+2001         (space)      Em quad
           U+2002         (space)      En space
           U+2003         (space)      Em space
           U+2004         (space)      Three-per-em space
           U+2005         (space)      Four-per-em space
           U+2006         (space)      Six-per-em space
           U+2007         (space)      Figure space
           U+2008         (space)      Punctuation space
           U+2009         (space)      Thin space
           U+200A         (space)      Hair space
           U+202F         (space)      Narrow no-break space
           U+205F         (space)      Medium mathematical space
           U+3000         (space)      Ideographic space
           U+200B         (removed)    Zero width space
           U+FEFF         (removed)    Zero width no-break space (BOM)
           U+2060         (removed)    Word joiner
           U+2028         (removed)    Line separator
           U+2029         (removed)    Paragraph separator
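Some rows cover a whole range or produce more than one character, which affects how a client-side copy of this table might be built. The following sketch is an assumption about one way to do it, using Python's str.translate with a dict keyed by code point; build_sample_table is a hypothetical name, and only a few rows are included. The small-capital range (U+1D00-U+1D22) is left out because it spans more code points than there are letters, so it cannot be mapped by a constant offset and would need explicit per-character entries.

```python
def build_sample_table() -> dict:
    """Map code points to replacement strings (None means 'remove')."""
    table = {}
    for cp in range(0xFF10, 0xFF1A):  # fullwidth digits -> 0-9
        table[cp] = chr(cp - 0xFF10 + ord("0"))
    for cp in range(0xFF21, 0xFF3B):  # fullwidth capitals -> A-Z
        table[cp] = chr(cp - 0xFF21 + ord("A"))
    table.update({
        0x00BD: "1/2",  # one character expands to three
        0x203C: "!!",   # double exclamation mark
        0x3000: " ",    # ideographic space -> regular space
        0x200B: None,   # zero width space is removed
        0xFEFF: None,   # zero width no-break space (BOM) is removed
    })
    return table


TABLE = build_sample_table()
converted = "\uff33\uff2d\uff33\u3000\uff11\uff10\u203c\u200b".translate(TABLE)
print(converted)  # SMS 10!!
```

Because replacements such as 1/2, 3/4, and ... are longer than the character they replace, the transliterated text can be longer than the original, so segment counts are best computed after transliteration.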

Limitations

Smart encoding cannot help with:

  • Emoji: 😀, ❤️, etc. require Unicode;

  • CJK characters: Chinese, Japanese, and Korean scripts;

  • Arabic, Hebrew, Thai: Scripts not in GSM-7;

  • Accented characters not in GSM-7: Characters like ő, ű (only basic accents like é, ü are in GSM-7);
