Commit
feat: add isGurmukhi function to check if string is Unicode Gurmukhi (#…
sarabveer authored May 30, 2020
1 parent d8e76d1 commit 32a151b
Showing 8 changed files with 119 additions and 4 deletions.
15 changes: 13 additions & 2 deletions README.hbs
@@ -20,17 +20,28 @@ Want to speak with us? <p>[![Slack](https://slack.shabados.com/badge.svg)](https

The library can be imported into Node as below:
```javascript
const { toUnicode, toAscii, firstLetters, transliterate, toShahmukhi } = require('gurmukhi-utils')
const {
toUnicode,
toAscii,
firstLetters,
toEnglish,
toHindi,
toShahmukhi,
stripAccents,
stripVishraams,
isGurmukhi,
} = require( 'gurmukhi-utils' )

toUnicode('Koj') // => ਖੋਜ
toAscii('ਖੋਜ') // => Koj
firstLetters('hir hir hir gunI') // => hhhg
firstLetters('ਹਰਿ ਹਰਿ ਹਰਿ ਗੁਨੀ') // => ਹਹਹਗ
transliterate('hukmI hukmu clwey rwhu ]') // => hukamee hukam chalaae raahu ||
toEnglish('hukmI hukmu clwey rwhu ]') // => hukamee hukam chalaae raahu ||
toHindi('ਕੁਲ ਜਨ ਮਧੇ ਮਿਲੵੋਿ ਸਾਰਗ ਪਾਨ ਰੇ ॥') // => कुल जन मधे मिल्यो सारग पान रे ॥
toShahmukhi('ਹਰਿ ਹਰਿ ਹਰਿ ਗੁਨੀ') // => هر هر هر گُنی
stripAccents('ਜ਼ਫ਼ੈਸ਼ਸ') // => ਜਫੈਸਸ
stripVishraams('sbid mrY. so mir rhY; iPir.') // => sbid mrY so mir rhY iPir
isGurmukhi('ਗੁਰਮੁਖੀ') // => true
```

Additionally, the package is available for web use via [unpkg CDN](https://unpkg.com/gurmukhi-utils).
15 changes: 15 additions & 0 deletions README.md
@@ -20,6 +20,7 @@ Want to speak with us? <p>[![Slack](https://slack.shabados.com/badge.svg)](https
- [Usage](#usage)
- [API](#api)
* [firstLetters(line, [stripNukta], [withVishraams]) ⇒ String](#firstlettersline-stripnukta-withvishraams-%E2%87%92-string)
* [isGurmukhi(text, [exhaustive]) ⇒ boolean](#isgurmukhitext-exhaustive-%E2%87%92-boolean)
* [stripAccents(text) ⇒ String](#stripaccentstext-%E2%87%92-string)
* [stripVishraams(text, options) ⇒ String](#stripvishraamstext-options-%E2%87%92-string)
* [toAscii(text) ⇒ String](#toasciitext-%E2%87%92-string)
@@ -96,6 +97,20 @@ firstLetters('iZir&qym sMdUk* drIXw AmIk* ]', false) // => Zsda
```js
firstLetters('sbid mrY. so mir rhY; iPir. mrY n, dUjI vwr ]', true, true) // => sm.smr;P.mn,dv
```
### isGurmukhi(text, [exhaustive]) ⇒ <code>boolean</code>
Checks whether the first character of a string is in the Gurmukhi Unicode block (U+0A00 to U+0A7F).

**Returns**: <code>boolean</code> - `true` if the checked text is Unicode Gurmukhi, `false` otherwise.

| Param | Type | Description |
| --- | --- | --- |
| text | <code>String</code> | The text to check. |
| [exhaustive] | <code>boolean</code> | If `true`, checks every character of the string, ignoring spaces, zero-width spaces, dandas, and vishraams. |

**Example**
```js
isGurmukhi('ਗੁਰਮੁਖੀ') // => true
isGurmukhi('gurmuKI') // => false
```
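The difference between the default first-character check and the `exhaustive` check can be illustrated with a standalone sketch. It is modelled on `lib/isGurmukhi.js` from this commit but does not import the package: `isGurmukhiSketch`, `inGurmukhiBlock`, and `ignoredChars` are hypothetical names, and the vishraam characters (`;`, `,`, `.`) are an assumption, since the contents of `vishraams.json` are not shown here.

```js
// Gurmukhi Unicode block bounds (U+0A00 to U+0A7F), i.e. 2560 to 2687 in decimal.
const GURMUKHI_START = 0x0A00
const GURMUKHI_END = 0x0A7F

// True if the first UTF-16 code unit of `text` falls inside the Gurmukhi block.
const inGurmukhiBlock = text => {
  const code = text.charCodeAt(0)
  return code >= GURMUKHI_START && code <= GURMUKHI_END
}

// Characters skipped during an exhaustive check.
// ASSUMPTION: ';', ',' and '.' stand in for the values of vishraams.json.
const ignoredChars = [' ', '\u200B', '।', '॥', ';', ',', '.']

// Hypothetical stand-in for isGurmukhi(text, [exhaustive]).
const isGurmukhiSketch = (text, exhaustive = false) => (
  exhaustive
    ? [...text].filter(char => !ignoredChars.includes(char)).every(inGurmukhiBlock)
    : inGurmukhiBlock(text)
)

console.log(isGurmukhiSketch('ਗੁਰਮੁਖੀ & English'))       // first char only
console.log(isGurmukhiSketch('ਗੁਰਮੁਖੀ & English', true)) // every char
```

With this logic, a mixed string that starts with a Gurmukhi letter passes the default check but fails the exhaustive one, which matches the expectations in `test/isGurmukhi.spec.js`. One edge case worth knowing about in a check of this shape: an empty string fails the default check (its first char code is `NaN`) but passes the exhaustive one, because `every` on an empty array returns `true`.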
### stripAccents(text) ⇒ <code>String</code>
Replaces accented ASCII/Unicode Gurmukhi letters with their base letter.
Useful for generalising search queries.
15 changes: 15 additions & 0 deletions docs/README.md
@@ -20,6 +20,7 @@ Want to speak with us? <p>[![Slack](https://slack.shabados.com/badge.svg)](https
- [Usage](#usage)
- [API](#api)
* [firstLetters(line, [stripNukta], [withVishraams]) ⇒ String](#firstlettersline-stripnukta-withvishraams-%E2%87%92-string)
* [isGurmukhi(text, [exhaustive]) ⇒ boolean](#isgurmukhitext-exhaustive-%E2%87%92-boolean)
* [stripAccents(text) ⇒ String](#stripaccentstext-%E2%87%92-string)
* [stripVishraams(text, options) ⇒ String](#stripvishraamstext-options-%E2%87%92-string)
* [toAscii(text) ⇒ String](#toasciitext-%E2%87%92-string)
@@ -96,6 +97,20 @@ firstLetters('iZir&qym sMdUk* drIXw AmIk* ]', false) // => Zsda
```js
firstLetters('sbid mrY. so mir rhY; iPir. mrY n, dUjI vwr ]', true, true) // => sm.smr;P.mn,dv
```
### isGurmukhi(text, [exhaustive]) ⇒ <code>boolean</code>
Checks whether the first character of a string is in the Gurmukhi Unicode block (U+0A00 to U+0A7F).

**Returns**: <code>boolean</code> - `true` if the checked text is Unicode Gurmukhi, `false` otherwise.

| Param | Type | Description |
| --- | --- | --- |
| text | <code>String</code> | The text to check. |
| [exhaustive] | <code>boolean</code> | If `true`, checks every character of the string, ignoring spaces, zero-width spaces, dandas, and vishraams. |

**Example**
```js
isGurmukhi('ਗੁਰਮੁਖੀ') // => true
isGurmukhi('gurmuKI') // => false
```
### stripAccents(text) ⇒ <code>String</code>
Replaces accented ASCII/Unicode Gurmukhi letters with their base letter.
Useful for generalising search queries.
7 changes: 5 additions & 2 deletions example.js
@@ -4,16 +4,19 @@ const {
toAscii,
toUnicode,
firstLetters,
transliterate,
toEnglish,
stripAccents,
stripVishraams,
isGurmukhi
} = require( 'gurmukhi-utils' )

console.log(toUnicode( 'Koj' ))
console.log(toAscii('ਖੋਜ'))
console.log(firstLetters( 'hir hir hir gun gwvhu ]' ))
console.log(firstLetters( 'ਹਰਿ ਹਰਿ ਹਰਿ ਗੁਨੀ' ))
console.log(transliterate( 'ਹੁਕਮੀ ਹੁਕਮੁ ਚਲਾਏ ਰਾਹੁ ॥' ))
console.log(toEnglish( 'ਹੁਕਮੀ ਹੁਕਮੁ ਚਲਾਏ ਰਾਹੁ ॥' ))
console.log(toShahmukhi( 'ਹਰਿ ਹਰਿ ਹਰਿ ਗੁਨੀ' ))
console.log(toHindi( 'ਕੁਲ ਜਨ ਮਧੇ ਮਿਲੵੋਿ ਸਾਰਗ ਪਾਨ ਰੇ ॥' ))
console.log(stripAccents('ਜ਼ਫ਼ੈਸ਼ਸ'))
console.log(stripVishraams('sbid mrY. so mir rhY; iPir.'))
console.log(isGurmukhi('ਗੁਰਮੁਖੀ'))
2 changes: 2 additions & 0 deletions index.d.ts
@@ -10,6 +10,8 @@ export function toShamukhi(text: string): string

export function firstLetters(text: string, stripNukta?: boolean = true, withVishraams?: boolean): string

export function isGurmukhi(text: string, exhaustive?: boolean): boolean

export function stripAccents(text: string): string
interface StripVishraamsOptions {
heavy?: boolean;
2 changes: 2 additions & 0 deletions index.js
@@ -4,6 +4,7 @@ const firstLetters = require( './lib/firstLetters' )
const toEnglish = require( './lib/toEnglish' )
const toShahmukhi = require( './lib/toShahmukhi' )
const toHindi = require( './lib/toHindi' )
const isGurmukhi = require( './lib/isGurmukhi' )
const stripAccents = require( './lib/stripAccents' )
const stripVishraams = require( './lib/stripVishraams' )

@@ -14,6 +15,7 @@ module.exports = {
toEnglish,
toShahmukhi,
toHindi,
isGurmukhi,
stripAccents,
stripVishraams,
}
32 changes: 32 additions & 0 deletions lib/isGurmukhi.js
@@ -0,0 +1,32 @@
const vishraams = Object.values( require( './vishraams.json' ) )

// Checks if Unicode Text is in Gurmukhi Block (U+0A00 - U+0A7F)
const checkCharCode = text => ( text.charCodeAt( 0 ) >= 2560 && text.charCodeAt( 0 ) <= 2687 )

// Characters to filter from text if doing an exhaustive check
const filteredChars = [
' ',
'\u200B',
'।',
'॥',
...vishraams,
]

/**
* Checks whether the first character of a string is in the Gurmukhi Unicode block (U+0A00 to U+0A7F).
* @param {String} text The text to check.
* @param {boolean} [exhaustive] If `true`, checks every character of the string, ignoring spaces, zero-width spaces, dandas, and vishraams.
* @return {boolean} `true` if the checked text is Unicode Gurmukhi, `false` otherwise.
* @example
* isGurmukhi('ਗੁਰਮੁਖੀ') // => true
* isGurmukhi('gurmuKI') // => false
*/
const isGurmukhi = ( text, exhaustive ) => (
exhaustive
? text.split( '' )
.filter( i => !filteredChars.includes( i ) )
.every( checkCharCode )
: checkCharCode( text )
)

module.exports = isGurmukhi
35 changes: 35 additions & 0 deletions test/isGurmukhi.spec.js
@@ -0,0 +1,35 @@
const { expect } = require( 'chai' )

const { isGurmukhi } = require( '../index' )

describe( 'isGurmukhi(line)', () => {
const lines = [
[ 'ਗੁਰਮੁਖੀ', true ],
[ 'ਮੈਂ ਗੁਰਮੁਖੀ ਵਿਚ ਲਿਖ ਰਿਹਾ ਹਾਂ।', true ],
[ 'ਲੜੀਵਾਰ​ਗੁਰਬਾਣੀ', true ], // Has U+200B, Zero Width Space
[ 'मैं हिंदी में लिख रहा हूँ।', false ],
[ 'میں شاہ رخ میں لکھ رہا ہوں۔', false ],
[ 'ਗੁਰਮੁਖੀ & English', true ],
[ 'English & ਗੁਰਮੁਖੀ', false ],
]

lines.map( ( [ string, result ] ) => it( `String '${string}' should return ${result}`, () => {
expect( isGurmukhi( string ) ).to.equal( result )
} ) )
} )

describe( 'isGurmukhi(line, true)', () => {
const lines = [
[ 'ਗੁਰਮੁਖੀ', true ],
[ 'ਮੈਂ ਗੁਰਮੁਖੀ ਵਿਚ ਲਿਖ ਰਿਹਾ ਹਾਂ।', true ],
[ 'ਲੜੀਵਾਰ​ਗੁਰਬਾਣੀ', true ], // Has U+200B, Zero Width Space
[ 'मैं हिंदी में लिख रहा हूँ।', false ],
[ 'میں شاہ رخ میں لکھ رہا ہوں۔', false ],
[ 'ਗੁਰਮੁਖੀ & English', false ],
[ 'English & ਗੁਰਮੁਖੀ', false ],
]

lines.map( ( [ string, result ] ) => it( `String '${string}' should return ${result}`, () => {
expect( isGurmukhi( string, true ) ).to.equal( result )
} ) )
} )
