Open In Colab

Word Level Seq2Seq Model

Sequence-to-sequence Neural Machine Translation is an example of a Conditional Language Model.

  • Language Model - the decoder predicts the next word of the target sentence based on the sequence generated so far
  • Conditional - the predictions are conditioned on the source sentence $x$ (in addition to the target words generated so far)

It calculates $P(y|x)$ where $x$ is the source sentence & $y$ is the target sentence. $$P(y|x)=P(y_{1}|x)P(y_{2}|y_{1},x)P(y_{3}|y_{1},y_{2},x)...P(y_{T}|y_{1},...,y_{T-1},x)$$

Each of the above terms can be interpreted as the probability of the next word, given the target words so far and the source sentence $x$.
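
For instance, with made-up per-step probabilities (purely illustrative, not from any trained model), the product can be accumulated in log-space to avoid underflow:

import math

# hypothetical values of P(y_t | y_1..y_{t-1}, x) for a 4-token target
step_probs = [0.42, 0.61, 0.55, 0.90]
log_p = sum(math.log(p) for p in step_probs)  # log P(y|x) = sum of per-step log-probs
print(log_p, math.exp(log_p))                 # ≈ -2.0650 and P(y|x) ≈ 0.1268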

Encoder

Encoder

Decoder

Decoder

Stepwise, the decoder operates as -

Decoder Operation

Dataset

English to Spanish Conversion - http://www.manythings.org/anki/spa-eng.zip

!wget http://www.manythings.org/anki/spa-eng.zip
--2020-04-20 06:06:41--  http://www.manythings.org/anki/spa-eng.zip
Resolving www.manythings.org (www.manythings.org)... 104.24.109.196, 104.24.108.196, 2606:4700:3033::6818:6dc4, ...
Connecting to www.manythings.org (www.manythings.org)|104.24.109.196|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4781548 (4.6M) [application/zip]
Saving to: ‘spa-eng.zip’

spa-eng.zip         100%[===================>]   4.56M  3.06MB/s    in 1.5s    

2020-04-20 06:06:43 (3.06 MB/s) - ‘spa-eng.zip’ saved [4781548/4781548]

!unzip -l spa-eng.zip
!unzip spa-eng.zip
Archive:  spa-eng.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1441  2020-03-15 02:17   _about.txt
 18493172  2020-03-15 02:17   spa.txt
---------                     -------
 18494613                     2 files
Archive:  spa-eng.zip
  inflating: _about.txt              
  inflating: spa.txt                 
from collections import Counter
import matplotlib.pyplot as plt
from itertools import islice
import math
import numpy as np
import pandas as pd
import random
import re
import requests
import seaborn as sns
import string
from string import digits
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.callbacks import CSVLogger, EarlyStopping
from tensorflow.keras.layers import Input, LSTM, Embedding, Dense
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.utils import plot_model
from tensorflow.python.framework.ops import disable_eager_execution, enable_eager_execution
disable_eager_execution()

Note: Disabling eager execution because an all-zeros mask raises a CuDNN kernel-level issue. Refer here
%matplotlib inline
sns.set_style("whitegrid")
lines = pd.read_table('spa.txt', names=['english', 'spanish', 'attributes'])
# lines = pd.DataFrame({"english": ["Juan eats apples"], "spanish": ["Juan come manzanas"], "attributes": ""})
lines.shape
(123770, 3)
lines = lines.drop(columns=['attributes'])
for col in lines.columns:
    # lowercase
    lines[col] = lines[col].apply(lambda x: x.lower())
    # remove quotes
    lines[col] = lines[col].apply(lambda x: re.sub("'", "", x))
    # remove punctuations
    lines[col] = lines[col].apply(lambda x: ''.join(ch for ch in x if ch not in set(string.punctuation)))
    # remove numbers
    remove_digits = str.maketrans('', '', digits)
    lines[col] = lines[col].apply(lambda x: x.translate(remove_digits))
    # remove unnecessary spaces
    lines[col] = lines[col].apply(lambda x: x.strip())
    lines[col] = lines[col].apply(lambda x: re.sub(" +", " ", x))
# Add start and end tokens to target sequences
lines['spanish'] = lines['spanish'].apply(lambda x : 'START_ '+ x + ' _END')
pd.set_option('display.max_colwidth', 100)
lines.head(10)
english spanish
0 go START_ ve _END
1 go START_ vete _END
2 go START_ vaya _END
3 go START_ váyase _END
4 hi START_ hola _END
5 run START_ ¡corre _END
6 run START_ ¡corran _END
7 run START_ ¡corra _END
8 run START_ ¡corred _END
9 run START_ corred _END

Creating Vocabulary

Create vocabulary of english and spanish words

# English Vocab
all_eng_words = set()
for eng in lines['english']:
    for word in eng.split():
        if word not in all_eng_words:
            all_eng_words.add(word)
print(f"English Vocab: {len(all_eng_words)}")
English Vocab: 13475
# Spanish Vocab
all_spa_words = set()
for spa in lines['spanish']:
    for word in spa.split():
        if word not in all_spa_words:
            all_spa_words.add(word)
print(f"Spanish Vocab: {len(all_spa_words)}")
Spanish Vocab: 27264
# Max Length of source sequence
length_list_eng=[]
for l in lines['english']:
    length_list_eng.append(len(l.split(' ')))
max_length_src = np.max(length_list_eng)
print(f"Max Length Sentence (English): {max_length_src}")
Max Length Sentence (English): 47
# Max Length of target sequence
length_list_spa=[]
for l in lines['spanish']:
    length_list_spa.append(len(l.split(' ')))
max_length_tar = np.max(length_list_spa)
print(f"Max Length Sentence (Spanish): {max_length_tar}")
Max Length Sentence (Spanish): 47
matches = [i for i, j in zip(length_list_eng, length_list_spa) if i == j]
print(f"Number of matches: {len(matches)} ({(len(matches)*100/lines.shape[0]):.2f})")
Number of matches: 13865 (11.20)
lines.head()
english spanish
0 go START_ ve _END
1 go START_ vete _END
2 go START_ vaya _END
3 go START_ váyase _END
4 hi START_ hola _END
input_words = sorted(list(all_eng_words))
target_words = sorted(list(all_spa_words))
num_encoder_tokens = len(all_eng_words)
num_decoder_tokens = len(all_spa_words)
num_encoder_tokens, num_decoder_tokens
(13475, 27264)
num_encoder_tokens += 1 # For zero padding
num_decoder_tokens += 1 # For zero padding

Tokenization

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))
input_token_index = dict([(word, i+1) for i, word in enumerate(input_words)])
target_token_index = dict([(word, i+1) for i, word in enumerate(target_words)])
n_items = take(10, input_token_index.items())
for k,v in n_items:
    print(k, v)
a 1
aardvark 2
aardvarks 3
aaron 4
aback 5
abandon 6
abandoned 7
abandoning 8
abate 9
abated 10
n_items = take(10, target_token_index.items())
for k,v in n_items:
    print(k, v)
START_ 1
_END 2
a 3
aabe 4
aah 5
aaron 6
abajo 7
abandona 8
abandonada 9
abandonadas 10
reverse_input_char_index = dict((i, word) for word, i in input_token_index.items())
reverse_target_char_index = dict((i, word) for word, i in target_token_index.items())
lines = shuffle(lines)
lines.head(10)
english spanish
36217 you are my best friend START_ eres mi mejor amigo _END
118304 if you do not have this program you can download it now START_ si usted no dispone de este programa puede descargarlo ahora _END
47672 she got married last year START_ se casó el año pasado _END
32053 can you climb the tree START_ ¿puedes trepar al árbol _END
34632 someone stole my money START_ alguien se voló mi plata _END
93415 we have till tomorrow night to decide START_ tenemos hasta mañana por la noche para decidirnos _END
103998 tom claimed he killed mary in selfdefense START_ tom alegó que mató a mary en defensa propia _END
8167 he is not young START_ él no es joven _END
18336 she wants to dance START_ ella quiere bailar _END
112211 im looking for someone who can speak portuguese START_ busco a alguien que sepa portugués _END

Train-Test Split

X, y = lines["english"], lines["spanish"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train.shape, y_train.shape
((99016,), (99016,))
X_test.shape, y_test.shape
((24754,), (24754,))

Generator

def generate_batch(X=X_train, y=y_train, batch_size=128):
    ''' Generate a batch of data '''
    while True:
        for j in range(0, len(X), batch_size):
            encoder_input_data = np.zeros((batch_size, max_length_src), dtype='float32')
            decoder_input_data = np.zeros((batch_size, max_length_tar), dtype='float32')
            decoder_target_data = np.zeros((batch_size, max_length_tar, num_decoder_tokens), dtype='float32')
            for i, (input_text, target_text) in enumerate(zip(X[j:j+batch_size], y[j:j+batch_size])):
                for t, word in enumerate(input_text.split()):
                    encoder_input_data[i, t] = input_token_index[word] # encoder input seq
                for t, word in enumerate(target_text.split()):
                    if t < len(target_text.split())-1:
                        decoder_input_data[i, t] = target_token_index[word] # decoder input seq
                    if t>0:
                        # decoder target sequence (one hot encoded)
                        # does not include the START_ token
                        # Offset by one timestep
                        decoder_target_data[i, t - 1, target_token_index[word]] = 1.
            yield([encoder_input_data, decoder_input_data], decoder_target_data)
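
As a quick sanity check (a sketch), pull one batch and inspect its shapes:

sanity_gen = generate_batch(X_train, y_train, batch_size=128)
[enc_in, dec_in], dec_tgt = next(sanity_gen)
print(enc_in.shape)   # (128, max_length_src) -> (128, 47)
print(dec_in.shape)   # (128, max_length_tar)
print(dec_tgt.shape)  # (128, max_length_tar, num_decoder_tokens)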

Teacher Forcing

Teacher forcing works by using the actual or expected output from the training dataset at the current time step y(t) as input in the next time step X(t+1), rather than the output generated by the network.

The decoder is trained to turn the target sequences into the same sequences but offset by one timestep in the future, a training process called "teacher forcing" in this context. Effectively, the decoder learns to generate targets [t+1...] given targets [...t], conditioned on the input sequence.

Example -

Suppose we had only 1 sentence -

  • English - Juan eats apples
  • Spanish - Juan come manzanas

Hence, we have just 3 words in our English vocabulary & 5 in our Spanish one.

English Vocabulary
{'apples': 1, 'eats': 2, 'juan': 3}

Spanish Vocabulary
{'START_': 1, '_END': 2, 'come': 3, 'juan': 4, 'manzanas': 5}

So our encoder input & decoder input would look like -

Encoder Input Data: [[3. 2. 1.]]

Decoder Input Data: [[1. 4. 3. 5. 0.]]

As the target sequence has 5 tokens, the decoder target data has 5 timesteps. At timestep t-1 we one-hot encode the actual word of timestep t, so the targets exclude START_ and are offset by one. Effectively, we get 4 one-hot target vectors plus one all-zeros padding row.

Decoder Target Data: 
[0. 0. 0. 0. 1. 0.] # juan
[0. 0. 0. 1. 0. 0.] # come
[0. 0. 0. 0. 0. 1.] # manzanas
[0. 0. 1. 0. 0. 0.] # _END
[0. 0. 0. 0. 0. 0.]
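
A minimal sketch that reproduces these arrays for the toy pair, assuming the two toy vocabularies listed above:

eng_index = {'apples': 1, 'eats': 2, 'juan': 3}
spa_index = {'START_': 1, '_END': 2, 'come': 3, 'juan': 4, 'manzanas': 5}
src = 'juan eats apples'.split()
tgt = 'START_ juan come manzanas _END'.split()
encoder_in = np.zeros((1, len(src)))
decoder_in = np.zeros((1, len(tgt)))
decoder_tgt = np.zeros((1, len(tgt), len(spa_index) + 1))  # +1 for padding index 0
for t, w in enumerate(src):
    encoder_in[0, t] = eng_index[w]
for t, w in enumerate(tgt):
    if t < len(tgt) - 1:
        decoder_in[0, t] = spa_index[w]            # decoder input drops _END
    if t > 0:
        decoder_tgt[0, t - 1, spa_index[w]] = 1.   # target drops START_, offset by one
print(encoder_in)      # [[3. 2. 1.]]
print(decoder_in)      # [[1. 4. 3. 5. 0.]]
print(decoder_tgt[0])  # the 5 rows shown above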

Summary

TS1 -

Encoder Input Data - [3. 2. 1.]
Decoder Input Data: [1. 4. 3. 5. 0.]
Decoder Target Data: [0. 0. 0. 0. 1. 0.] # juan

TS2 -

Encoder Input Data - [3. 2. 1.]
Decoder Input Data: [1. 4. 3. 5. 0.]
Decoder Target Data: [0. 0. 0. 1. 0. 0.] # juan come

TS3 -

Encoder Input Data - [3. 2. 1.]
Decoder Input Data: [1. 4. 3. 5. 0.]
Decoder Target Data: [0. 0. 0. 0. 0. 1.] # juan come manzanas

TS4 -

Encoder Input Data - [3. 2. 1.]
Decoder Input Data: [1. 4. 3. 5. 0.]
Decoder Target Data: [0. 0. 1. 0. 0. 0.] # juan come manzanas _END

Model

latent_dim = 100
# ENCODER
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(num_encoder_tokens, latent_dim, mask_zero=True)(encoder_inputs)
encoder_lstm = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder_lstm(enc_emb)
# discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]

mask_zero=True - It treats '0' as a padding value. As per the docs, "If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1)". This is why we increased num_encoder_tokens & num_decoder_tokens earlier.
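
A quick way to see the masking behaviour (a sketch; with eager execution enabled the mask prints as boolean values):

emb = Embedding(input_dim=10, output_dim=4, mask_zero=True)
sample = tf.constant([[5, 3, 0, 0]])  # two real tokens, two padding zeros
print(emb.compute_mask(sample))       # [[True, True, False, False]] when run eagerly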

# set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(None,))
dec_emb_layer = Embedding(num_decoder_tokens, latent_dim, mask_zero=True)
dec_emb = dec_emb_layer(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, dec_state_h, dec_state_c = decoder_lstm(dec_emb, initial_state=encoder_states)
WARNING:tensorflow:Layer lstm_1 will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

Here, we add a Dense layer with softmax activation on top of the decoder. Notice how, for the sample sentence Juan eats apples, the output target at each timestep looks like -

[0. 0. 0. 0. 1. 0.] # juan
[0. 0. 0. 1. 0. 0.] # juan come
[0. 0. 0. 0. 0. 1.] # juan come manzanas
[0. 0. 1. 0. 0. 0.] # juan come manzanas _end

It is the job of the dense layer to predict this next word from the decoder_outputs

The model takes encoder inputs & decoder inputs and returns decoder outputs

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
plot_model(model, show_shapes=True)

Training

train_samples = len(X_train)
val_samples = len(X_test)
batch_size = 128
epochs = 50
csvlogger = CSVLogger("training.log")
earlystopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
callbacks = [csvlogger, earlystopping]
history = model.fit_generator(generator = generate_batch(X_train, y_train, batch_size = batch_size),
                    steps_per_epoch = train_samples//batch_size,
                    epochs=epochs,
                    validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
                    validation_steps = val_samples//batch_size,
                    callbacks=callbacks
                    )
WARNING:tensorflow:From <ipython-input-42-c7455d30101d>:6: Model.fit_generator (from tensorflow.python.keras.engine.training_v1) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
Epoch 1/50
773/773 [==============================] - 434s 561ms/step - loss: 0.8506 - acc: 0.1638 - val_loss: 0.7588 - val_acc: 0.2107
Epoch 2/50
773/773 [==============================] - 436s 564ms/step - loss: 0.6943 - acc: 0.2561 - val_loss: 0.6517 - val_acc: 0.2900
Epoch 3/50
773/773 [==============================] - 435s 563ms/step - loss: 0.6062 - acc: 0.3210 - val_loss: 0.5916 - val_acc: 0.3357
Epoch 4/50
773/773 [==============================] - 434s 561ms/step - loss: 0.5472 - acc: 0.3637 - val_loss: 0.5486 - val_acc: 0.3686
Epoch 5/50
773/773 [==============================] - 434s 562ms/step - loss: 0.5002 - acc: 0.3994 - val_loss: 0.5149 - val_acc: 0.3968
Epoch 6/50
773/773 [==============================] - 433s 561ms/step - loss: 0.4612 - acc: 0.4302 - val_loss: 0.4895 - val_acc: 0.4196
Epoch 7/50
773/773 [==============================] - 435s 563ms/step - loss: 0.4284 - acc: 0.4568 - val_loss: 0.4689 - val_acc: 0.4367
Epoch 8/50
773/773 [==============================] - 435s 563ms/step - loss: 0.4000 - acc: 0.4807 - val_loss: 0.4523 - val_acc: 0.4519
Epoch 9/50
773/773 [==============================] - 436s 564ms/step - loss: 0.3749 - acc: 0.5024 - val_loss: 0.4377 - val_acc: 0.4657
Epoch 10/50
773/773 [==============================] - 436s 564ms/step - loss: 0.3524 - acc: 0.5225 - val_loss: 0.4265 - val_acc: 0.4763
Epoch 11/50
773/773 [==============================] - 437s 565ms/step - loss: 0.3322 - acc: 0.5404 - val_loss: 0.4169 - val_acc: 0.4862
Epoch 12/50
773/773 [==============================] - 437s 565ms/step - loss: 0.3137 - acc: 0.5580 - val_loss: 0.4084 - val_acc: 0.4944
Epoch 13/50
773/773 [==============================] - 435s 563ms/step - loss: 0.2968 - acc: 0.5741 - val_loss: 0.4029 - val_acc: 0.5006
Epoch 14/50
773/773 [==============================] - 435s 562ms/step - loss: 0.2812 - acc: 0.5896 - val_loss: 0.3967 - val_acc: 0.5082
Epoch 15/50
773/773 [==============================] - 434s 562ms/step - loss: 0.2669 - acc: 0.6042 - val_loss: 0.3921 - val_acc: 0.5133
Epoch 16/50
773/773 [==============================] - 434s 561ms/step - loss: 0.2537 - acc: 0.6189 - val_loss: 0.3874 - val_acc: 0.5192
Epoch 17/50
773/773 [==============================] - 434s 562ms/step - loss: 0.2413 - acc: 0.6333 - val_loss: 0.3843 - val_acc: 0.5247
Epoch 18/50
773/773 [==============================] - 435s 562ms/step - loss: 0.2301 - acc: 0.6472 - val_loss: 0.3814 - val_acc: 0.5277
Epoch 19/50
773/773 [==============================] - 438s 566ms/step - loss: 0.2199 - acc: 0.6596 - val_loss: 0.3797 - val_acc: 0.5306
Epoch 20/50
773/773 [==============================] - 436s 564ms/step - loss: 0.2105 - acc: 0.6717 - val_loss: 0.3777 - val_acc: 0.5337
Epoch 21/50
773/773 [==============================] - 434s 562ms/step - loss: 0.2015 - acc: 0.6836 - val_loss: 0.3774 - val_acc: 0.5347
Epoch 22/50
773/773 [==============================] - 434s 561ms/step - loss: 0.1934 - acc: 0.6950 - val_loss: 0.3774 - val_acc: 0.5353
Epoch 23/50
773/773 [==============================] - 434s 561ms/step - loss: 0.1859 - acc: 0.7051 - val_loss: 0.3753 - val_acc: 0.5392
Epoch 24/50
773/773 [==============================] - 433s 560ms/step - loss: 0.1788 - acc: 0.7154 - val_loss: 0.3758 - val_acc: 0.5415
Epoch 25/50
773/773 [==============================] - 433s 561ms/step - loss: 0.1724 - acc: 0.7238 - val_loss: 0.3782 - val_acc: 0.5405
Epoch 26/50
773/773 [==============================] - 433s 561ms/step - loss: 0.1664 - acc: 0.7326 - val_loss: 0.3779 - val_acc: 0.5428

The EarlyStopping callback kicked in at the end of epoch 26, as the validation loss failed to improve on 0.3753 (epoch 23) for three consecutive epochs: 0.3758 (epoch 24), 0.3782 (epoch 25) & 0.3779 (epoch 26)
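
Note that with patience=3 the model ends up with the weights from epoch 26 rather than the best epoch (23). If that matters, tf.keras's EarlyStopping can restore the best weights (a sketch):

earlystopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                                 restore_best_weights=True)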

# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
model.save("english_to_spanish_nmt.h5")
model = load_model("english_to_spanish_nmt.h5")
WARNING:tensorflow:Layer lstm will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
WARNING:tensorflow:Layer lstm_1 will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
files.download("english_to_spanish_nmt.h5")
files.download("training.log")

Inference

Encoder Setup

Encode the input sequence to get the encoder_states - state_h & state_c

encoder_model = Model(encoder_inputs, encoder_states)

Decoder setup

The tensors below will hold the states of the previous timestep. For the first timestep, assume -

  1. decoder_state_input_c - state_c
  2. decoder_state_input_h - state_h
decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

Get the embedding of the decoder input sequence. For the first timestep, this is the embedding vector for START_. If the next predicted word is juan, the decoder is then fed the embedding vector for juan.

dec_emb2 = dec_emb_layer(decoder_inputs) 

To predict the next word in the sequence, set the initial states to the states from the previous time step

decoder_outputs2, state_h2, state_c2 = decoder_lstm(dec_emb2, initial_state=decoder_states_inputs)

Predict the next word in the sequence using the dense layer, choosing the most probable word via the argmax of the softmax probability distribution.

decoder_outputs2 = decoder_dense(decoder_outputs2) 

Final Decoder Model

Inputs -

  1. decoder_inputs - the token predicted at the previous timestep
  2. decoder_states_inputs - previous timestep's hidden state & cell state

Outputs -

  1. decoder_outputs2 - softmax probability distribution over the target vocabulary, representing the predicted word
  2. decoder_states2 - current timestep's hidden state & cell state
decoder_states2 = [state_h2, state_c2]
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs2] + decoder_states2)

Decode Sequence

# https://github.com/numpy/numpy/issues/15201#issue-543733072

def categorical(p):
    return (p.cumsum(-1) >= np.random.uniform(size=p.shape[:-1])[..., None]).argmax(-1)
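
For instance (illustrative), sampling from a toy 3-way distribution:

p = np.array([[0.1, 0.2, 0.7]])
print(categorical(p))  # prints [2] about 70% of the time, [1] 20%, [0] 10%
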
def decode_sequence(input_seq):
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)
    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1,1))
    # Populate the first position of the target sequence with the start token.
    target_seq[0, 0] = target_token_index['START_']

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Sampling a token with max probability
        sampled_token_index = np.argmax(output_tokens[0, -1, :])

        # Sample from a categorical distribution
        # logits = output_tokens[0, -1, :]
        # sampled_token_index = categorical(np.reshape(logits, [-1, len(logits)]))[0]
         
        sampled_char = reverse_target_char_index[sampled_token_index]
        decoded_sentence += ' '+sampled_char

        # Exit condition: either hit max length (in words)
        # or find the stop token.
        if (sampled_char == '_END' or
           len(decoded_sentence.split()) > max_length_tar):
            stop_condition = True

        # Update the target sequence (of length 1).
        target_seq = np.zeros((1,1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]

    return decoded_sentence
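
With decode_sequence in hand, any preprocessed English sentence can be translated directly. A minimal sketch (every word must already exist in input_token_index, otherwise this raises a KeyError):

def translate(sentence):
    seq = np.zeros((1, max_length_src), dtype='float32')
    for t, word in enumerate(sentence.split()):
        seq[0, t] = input_token_index[word]
    return decode_sequence(seq)

print(translate('it never happened'))  # e.g. ' nunca sucedió _END'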

Beam Search Decoding

The core idea is to keep track of the $k$ most probable partial translations, where $k$ is the beam width (usually 5-10).

A hypothesis $y_{1}, y_{2}, y_{3}, ..., y_{t}$ has a score, which is its log probability:

$$score(y_{1}, y_{2}, y_{3}, ..., y_{t})=\log P(y_{1}, y_{2}, y_{3}, ..., y_{t}|x)=\sum_{i=1}^{t}\log P(y_{i}|y_{1}, y_{2}, y_{3}, ..., y_{i-1}, x)$$

  • Scores are all negative, as we are taking logs of probabilities in (0, 1)
  • We search for high-scoring hypotheses, keeping track of only the top k at each step

STOPPING CRITERIA

Different hypotheses may produce the $<END>$ token at different timesteps. Therefore,

  1. Once a hypothesis produces the $<END>$ token, we regard it as complete.
  2. We then set it aside and continue exploring the other hypotheses.

We usually stop when,

  1. We reach some cutoff timestep, say a sequence length of 50 or 100.
  2. We have a certain number of completed hypotheses.
def beam_search_decoder(predictions, top_k = 3):
    #start with an empty sequence with zero score
    output_sequences = [([], 0)]
    
    #looping through all the predictions
    for token_probs in predictions:
        new_sequences = []
        
        #append new tokens to old sequences and re-score
        for old_seq, old_score in output_sequences:
            for char_index in range(len(token_probs)):
                new_seq = old_seq + [char_index]
                #considering log-likelihood for scoring
                new_score = old_score + math.log(token_probs[char_index])
                new_sequences.append((new_seq, new_score))
                
        # sort all new sequences in the de-creasing order of their score
        output_sequences = sorted(new_sequences, key = lambda val: val[1], reverse = True)
        
        #select top-k based on score 
        # *Note- best sequence is with the highest score
        output_sequences = output_sequences[:top_k]
        
    return output_sequences
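
A toy run over made-up per-step distributions for a 3-token vocabulary; the best hypothesis picks tokens [1, 1, 2] with score log(0.5) + log(0.4) + log(0.8) ≈ -1.833:

toy_preds = np.array([[0.1, 0.5, 0.4],
                      [0.3, 0.4, 0.3],
                      [0.1, 0.1, 0.8]])
for seq, score in beam_search_decoder(toy_preds, top_k=3):
    print(seq, round(score, 3))  # first line: [1, 1, 2] -1.833
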
def decode_sequence_beam_search(input_seq):
    probabilities = []
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)
    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1,1))
    # Populate the first position of the target sequence with the start token.
    target_seq[0, 0] = target_token_index['START_']

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Sampling a token with max probability
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        probabilities.append(output_tokens[0, -1, :])
         
        sampled_char = reverse_target_char_index[sampled_token_index]
        decoded_sentence += ' '+sampled_char

        # Exit condition: either hit max length (in words)
        # or find the stop token.
        if (sampled_char == '_END' or
           len(decoded_sentence.split()) > max_length_tar):
            stop_condition = True

        # Update the target sequence (of length 1).
        target_seq = np.zeros((1,1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]

    # storing multiple results
    outputs = []
    beam_search_preds = beam_search_decoder(probabilities, top_k = 10)
    for prob_indexes, score in beam_search_preds:
        decoded_sentence = ''
        for index in prob_indexes:
            sampled_char = reverse_target_char_index[index]
            decoded_sentence += ' '+sampled_char
            if (sampled_char == '_END' or len(decoded_sentence.split()) > max_length_tar):
                break
        outputs.append(decoded_sentence)

    return outputs

Utility Function

Function that makes a request to MyMemory Translated to get back the English translation of the predicted Spanish sentence.

url = "https://api.mymemory.translated.net/get"

def get_translation(seq):
    data = {}
    data["q"] = seq
    data["langpair"] = "es|en"
    response = requests.post(url, data=data)
    translated_text = response.json()["responseData"]["translatedText"]
    return translated_text
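
For example (needs network access; the exact phrasing returned by the API can vary):

print(get_translation('nunca pasó'))  # e.g. 'It never happened.'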

White Distance

Metric to find the similarity between two sentences - the Dice coefficient over character bigrams (Simon White's string-similarity measure)

def upper_case(s):
    return s.upper()

def get_pairs(s):
    pairs = []
    words = s.strip().split(' ')
    for word in words:
        for idx in range(len(word)-1):
            pairs.append(word[idx:idx+2])
    return pairs

def get_similarity(s1, s2):
    s1 = upper_case(s1)
    s2 = upper_case(s2)
    p1 = get_pairs(s1)
    p2 = get_pairs(s2)
    nr = 2*len(list((Counter(p1) & Counter(p2)).elements()))
    dr = len(p1)+len(p2)
    return nr/dr
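
For example, identical sentences score 1.0 and partially overlapping ones land in between:

print(get_similarity('él es un buen violinista', 'él es un buen violinista'))  # 1.0
print(get_similarity('nunca pasó', 'nunca sucedió'))  # 8/17 ≈ 0.47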

Training Data

train_gen = generate_batch(X_train, y_train, batch_size = 1)
k=-1
for _ in range(20):
    k+=1
    (input_seq, actual_output), _ = next(train_gen)
    decoded_sentence = decode_sequence(input_seq)
    print('Input Sentence:', X_train[k:k+1].values[0])
    print('Actual Translation:', y_train[k:k+1].values[0][6:-4])
    print('Predicted Translation (Spanish):', decoded_sentence[:-4])
    # predicted spanish sequence back to english
    print('Predicted Translation (English):', get_translation(decoded_sentence[:-4]))
    print("="*60, end="\n\n")
Input Sentence: do you know how to drive
Actual Translation:  ¿sabes conducir 
Predicted Translation (Spanish):  ¿sabes cómo se levante 
Predicted Translation (English): Do you know how to get up
============================================================

Input Sentence: it never happened
Actual Translation:  nunca pasó 
Predicted Translation (Spanish):  nunca sucedió 
Predicted Translation (English): It never happened.
============================================================

Input Sentence: i havent slept in days
Actual Translation:  no he dormido en días 
Predicted Translation (Spanish):  no dormí en dos años 
Predicted Translation (English): I did not sleep in two years
============================================================

Input Sentence: i hope you enjoy your flight
Actual Translation:  espero que disfrute del vuelo 
Predicted Translation (Spanish):  espero que te vayas a la leche 
Predicted Translation (English): I hope you go to milk
============================================================

Input Sentence: the doctor advised him to give up smoking
Actual Translation:  el médico le aconsejó que dejara de fumar 
Predicted Translation (Spanish):  el médico le aconsejó que dejara de fumar 
Predicted Translation (English): The doctor advised him to give up smoking.
============================================================

Input Sentence: tom sees things
Actual Translation:  tom ve cosas 
Predicted Translation (Spanish):  tom ve cosas 
Predicted Translation (English): Tom sees things.
============================================================

Input Sentence: do we have enough flour
Actual Translation:  ¿tenemos suficiente harina 
Predicted Translation (Spanish):  ¿tenemos suficiente para beber 
Predicted Translation (English): Do we have enough to drink
============================================================

Input Sentence: tom crawled into bed just before midnight
Actual Translation:  tom se arrastró a la cama justo antes de medianoche 
Predicted Translation (Spanish):  tom me quité la casa por un paseo 
Predicted Translation (English): tom i took off the house for a walk
============================================================

Input Sentence: im already sick
Actual Translation:  ya estoy enferma 
Predicted Translation (Spanish):  ya estoy enfermo 
Predicted Translation (English): I am sick
============================================================

Input Sentence: what will you do when you grow up
Actual Translation:  ¿qué harás cuando seas mayor 
Predicted Translation (Spanish):  ¿qué vas a poner en cuando crezcas 
Predicted Translation (English): What are you going to put in when you grow up
============================================================

Input Sentence: could you send me a brochure
Actual Translation:  ¿podrías enviarme un folleto 
Predicted Translation (Spanish):  ¿podrías enviarme un catálogo 
Predicted Translation (English): Could you send me a catalog
============================================================

Input Sentence: i assure you that an error like this will never happen again
Actual Translation:  te aseguro que un error así no sucederá nunca más 
Predicted Translation (Spanish):  te aseguro que no he leído ni un error por favor 
Predicted Translation (English): I assure you that I have not read a mistake please
============================================================

Input Sentence: tom shut himself up in his bedroom
Actual Translation:  tom se encerró en su cuarto 
Predicted Translation (Spanish):  tom se encerró en su habitación 
Predicted Translation (English): He locked himself up in his room.
============================================================

Input Sentence: get out
Actual Translation:  bájate 
Predicted Translation (Spanish):  salid 
Predicted Translation (English): come out
============================================================

Input Sentence: tom had a hunch that mary was seeing someone else
Actual Translation:  tom tuvo un presentimiento de que mary estaba viendo a alguien más 
Predicted Translation (Spanish):  tom tuvo una foto de que mary vino a verlo 
Predicted Translation (English): tom had a picture of mary coming to see him
============================================================

Input Sentence: tom unlocked his briefcase
Actual Translation:  tom abrió los cerrojos de su maletín 
Predicted Translation (Spanish):  tom abrió la foto 
Predicted Translation (English): tom opened the photo
============================================================

Input Sentence: how does the film end
Actual Translation:  ¿cómo termina la película 
Predicted Translation (Spanish):  ¿cómo se llama esa película 
Predicted Translation (English): What&#39;s that movie called
============================================================

Input Sentence: a high school student made this robot
Actual Translation:  un estudiante de enseñanza media hizo este robot 
Predicted Translation (Spanish):  un estudiante de la lluvia es muy grande 
Predicted Translation (English): a rain student is very big
============================================================

Input Sentence: he is a good violinist
Actual Translation:  él es un buen violinista 
Predicted Translation (Spanish):  él es un buen nadador 
Predicted Translation (English): he is a good swimmer
============================================================

Input Sentence: what time do you get up on schooldays
Actual Translation:  ¿a qué hora te levantas en días de clase 
Predicted Translation (Spanish):  ¿a qué hora te levantas en australia 
Predicted Translation (English): What time do you get up in australia
============================================================

train_gen = generate_batch(X_train, y_train, batch_size = 1)
k=-1
for _ in range(20):
    k+=1
    similarity_scores = []
    (input_seq, actual_output), _ = next(train_gen)
    decoded_sentences = decode_sequence_beam_search(input_seq)
    actual_sentence = y_train[k:k+1].values[0][6:-4]
    print('Input Sentence:', X_train[k:k+1].values[0])
    print('Actual Translation:', actual_sentence)
    for idx, pred in enumerate(decoded_sentences):
        similarity_scores.append(get_similarity(pred, actual_sentence))
    dictionary = dict(zip(similarity_scores, decoded_sentences))
    dictionary = {k: v for k, v in sorted(dictionary.items(),
                                          key=lambda item: item[0],
                                          reverse=True)}
    closest_sentence = decoded_sentences[np.argmax(similarity_scores)]
    print(f"Closest Predicted Sentence (Spanish): {closest_sentence[:-4]}")
    print(f"Closest Predicted Sentence (English): {get_translation(closest_sentence[:-4])}", end="\n\n")
    decoded_sentences.remove(closest_sentence)
    for idx, pred in enumerate(list(dictionary.values())[:5]):
        print(f'Predicted Translation {idx}: {pred[:-4]}')
    print("="*30, end="\n\n")
Input Sentence: do you know how to drive
Actual Translation:  ¿sabes conducir 
Closest Predicted Sentence (Spanish):  ¿sabes conducir se levante 
Closest Predicted Sentence (English): Do you know drive up get up

Predicted Translation 0:  ¿sabes cómo se ponga 
Predicted Translation 1:  ¿sabes cómo se muevan 
Predicted Translation 2:  ¿sabes cómo se levante 
Predicted Translation 3:  ¿sabes cómo manejar levante 
Predicted Translation 4:  ¿sabes cómo conducir ponga 
==============================

Input Sentence: it never happened
Actual Translation:  nunca pasó 
Closest Predicted Sentence (Spanish):  nunca pasó 
Closest Predicted Sentence (English): It never happened.

Predicted Translation 0:  nunca se 
Predicted Translation 1:  nunca pasó 
Predicted Translation 2:  nunca ocurrió 
Predicted Translation 3:  nunca hizo 
Predicted Translation 4:  nunca fue 
==============================

Input Sentence: i havent slept in days
Actual Translation:  no he dormido en días 
Closest Predicted Sentence (Spanish):  no dormí en dos días 
Closest Predicted Sentence (English): I did not sleep in two days

Predicted Translation 0:  no me nada dos años 
Predicted Translation 1:  no me en dos años 
Predicted Translation 2:  no he nada dos años 
Predicted Translation 3:  no he en dos años 
Predicted Translation 4:  no duermo nada dos años 
==============================

Input Sentence: i hope you enjoy your flight
Actual Translation:  espero que disfrute del vuelo 
Closest Predicted Sentence (Spanish):  espero que disfrute vayas a la leche 
Closest Predicted Sentence (English): I hope you enjoy going to milk

Predicted Translation 0:  espero que usted vayas a la leche 
Predicted Translation 1:  espero que te vayas a la clase 
Predicted Translation 2:  espero que te pongas a la clase 
Predicted Translation 3:  espero que te la a la clase 
Predicted Translation 4:  espero que disfrute vayas a la clase 
==============================

Input Sentence: the doctor advised him to give up smoking
Actual Translation:  el médico le aconsejó que dejara de fumar 
Closest Predicted Sentence (Spanish):  el médico le aconsejó que dejara de fumar 
Closest Predicted Sentence (English): The doctor advised him to give up smoking.

Predicted Translation 0:  el médico le dijo que dejara de fumar 
Predicted Translation 1:  el médico le aconsejó que dejara la fumar 
Predicted Translation 2:  el médico le aconsejó que dejara de fumar 
Predicted Translation 3:  el médico le aconsejó que dejara a fumar 
Predicted Translation 4:  el doctor le aconsejó que dejara lo fumar 
==============================

Input Sentence: tom sees things
Actual Translation:  tom ve cosas 
Closest Predicted Sentence (Spanish):  tom ve cosas 
Closest Predicted Sentence (English): Tom sees things.

Predicted Translation 0:  ¡tom ve cosas 
Predicted Translation 1:  tomás ve cosas 
Predicted Translation 2:  tom ve las 
Predicted Translation 3:  tom ve cosas 
Predicted Translation 4:  tom ve aquí 
==============================

Input Sentence: do we have enough flour
Actual Translation:  ¿tenemos suficiente harina 
Closest Predicted Sentence (Spanish):  ¿tenemos suficiente harina beber 
Closest Predicted Sentence (English): Do we have enough flour to drink

Predicted Translation 0:  ¿tenemos suficiente para un 
Predicted Translation 1:  ¿tenemos suficiente para principiantes 
Predicted Translation 2:  ¿tenemos suficiente para otra 
Predicted Translation 3:  ¿tenemos suficiente para escribir 
Predicted Translation 4:  ¿tenemos suficiente para beber 
==============================

Input Sentence: tom crawled into bed just before midnight
Actual Translation:  tom se arrastró a la cama justo antes de medianoche 
Closest Predicted Sentence (Spanish):  tom me quité la casa por un rato 
Closest Predicted Sentence (English): tom i took off the house for a while

Predicted Translation 0:  tom me quité la casa por un rato 
Predicted Translation 1:  tom me quité la casa por un paso 
Predicted Translation 2:  tom me quité la casa por un paseo 
Predicted Translation 3:  tom me quité la casa por paso paso 
Predicted Translation 4:  tom me quité la casa por paso paseo 
==============================

Input Sentence: im already sick
Actual Translation:  ya estoy enferma 
Closest Predicted Sentence (Spanish):  ya estoy enferma 
Closest Predicted Sentence (English): I am sick.

Predicted Translation 0:  ¡estoy estoy enfermo 
Predicted Translation 1:  yo estoy enfermo 
Predicted Translation 2:  yo estoy enferma 
Predicted Translation 3:  ya me enferma 
Predicted Translation 4:  ya estoy enferma 
==============================

Input Sentence: what will you do when you grow up
Actual Translation:  ¿qué harás cuando seas mayor 
Closest Predicted Sentence (Spanish):  ¿qué vas a quedar en cuando 
Closest Predicted Sentence (English): What are you going to stay on when

Predicted Translation 0:  ¿qué vas a quedar en cuando crezcas 
Predicted Translation 1:  ¿qué vas a quedar en cuando 
Predicted Translation 2:  ¿qué vas a poner en las crezcas 
Predicted Translation 3:  ¿qué vas a poner en cuando 
Predicted Translation 4:  ¿qué vas a estas en cuando crezcas 
==============================

Input Sentence: could you send me a brochure
Actual Translation:  ¿podrías enviarme un folleto 
Closest Predicted Sentence (Spanish):  ¿podrías enviarme un folleto 
Closest Predicted Sentence (English): Could you send me a brochure?

Predicted Translation 0:  ¿puedes enviarme un folleto 
Predicted Translation 1:  ¿puedes enviarme un catálogo 
Predicted Translation 2:  ¿podrías un un catálogo 
Predicted Translation 3:  ¿podrías traerme un catálogo 
Predicted Translation 4:  ¿podrías enviarme un folleto 
==============================

Input Sentence: i assure you that an error like this will never happen again
Actual Translation:  te aseguro que un error así no sucederá nunca más 
Closest Predicted Sentence (Spanish):  te aseguro que no hace leído un un error de favor 
Closest Predicted Sentence (English): I assure you that you do not read an error in favor

Predicted Translation 0:  te aseguro que no lo leído ni un error por favor 
Predicted Translation 1:  te aseguro que no he leído un un error por favor 
Predicted Translation 2:  te aseguro que no he leído un un error de favor 
Predicted Translation 3:  te aseguro que no he leído ni un periódico por f
Predicted Translation 4:  te aseguro que no he leído ni un error de favor 
==============================

Input Sentence: tom shut himself up in his bedroom
Actual Translation:  tom se encerró en su cuarto 
Closest Predicted Sentence (Spanish):  tom se encerró en su cuarto 
Closest Predicted Sentence (English): Tom locked himself in his room.

Predicted Translation 0:  tom se quedó en su habitación 
Predicted Translation 1:  tom se ocultó en su habitación 
Predicted Translation 2:  tom se escondió en su habitación 
Predicted Translation 3:  tom se enjuagó en su habitación 
Predicted Translation 4:  tom se encerró en su pieza 
==============================

Input Sentence: get out
Actual Translation:  bájate 
Closest Predicted Sentence (Spanish):  bájate 
Closest Predicted Sentence (English): Get down.

Predicted Translation 0:  ¡vete 
Predicted Translation 1:  ¡quédate 
Predicted Translation 2:  ¡despierta 
Predicted Translation 3:  mantente 
Predicted Translation 4:  bájate 
==============================

Input Sentence: tom had a hunch that mary was seeing someone else
Actual Translation:  tom tuvo un presentimiento de que mary estaba viendo a alguien más 
Closest Predicted Sentence (Spanish):  tom tuvo una foto de que mary estaba a verlo 
Closest Predicted Sentence (English): Tom had a picture that Mary was seeing him

Predicted Translation 0:  tom tuvo una reunión de que mary vino a verlo 
Predicted Translation 1:  tom tuvo una reunión de mary mary vino a verlo 
Predicted Translation 2:  tom tuvo una foto de que mary vino a verlo 
Predicted Translation 3:  tom tuvo una foto de que mary vino a la 
Predicted Translation 4:  tom tuvo una foto de que mary lo a verlo 
==============================

Input Sentence: tom unlocked his briefcase
Actual Translation:  tom abrió los cerrojos de su maletín 
Closest Predicted Sentence (Spanish):  tom abrió la imagen 
Closest Predicted Sentence (English): tom opened the image

Predicted Translation 0:  tom abrió la primera 
Predicted Translation 1:  tom abrió la oportunidad 
Predicted Translation 2:  tom abrió la llave 
Predicted Translation 3:  tom abrió la imagen 
Predicted Translation 4:  tom abrió la fotografía 
==============================

Input Sentence: how does the film end
Actual Translation:  ¿cómo termina la película 
Closest Predicted Sentence (Spanish):  ¿cómo se llama esa película 
Closest Predicted Sentence (English): What&#39;s that movie called

Predicted Translation 0:  ¿cómo se ve esa película 
Predicted Translation 1:  ¿cómo se va esta película 
Predicted Translation 2:  ¿cómo se utiliza esa película 
Predicted Translation 3:  ¿cómo se recoge esa película 
Predicted Translation 4:  ¿cómo se llamaba esa película 
==============================

Input Sentence: a high school student made this robot
Actual Translation:  un estudiante de enseñanza media hizo este robot 
Closest Predicted Sentence (Spanish):  un estudiante de la lluvia está muy grande 
Closest Predicted Sentence (English): a rain student is very big

Predicted Translation 0:  un estudiante de la lluvia no muy grande 
Predicted Translation 1:  un estudiante de la lluvia está muy grande 
Predicted Translation 2:  un estudiante de la lluvia es muy grande 
Predicted Translation 3:  un estudiante de la lluvia es muy de 
Predicted Translation 4:  un estudiante de la lluvia es muy bueno 
==============================

Input Sentence: he is a good violinist
Actual Translation:  él es un buen violinista 
Closest Predicted Sentence (Spanish):  él es un buen violinista 
Closest Predicted Sentence (English): He is a good violinist.

Predicted Translation 0:  él es un buen violinista 
Predicted Translation 1:  él es un buen vecino 
Predicted Translation 2:  él es un buen pianista 
Predicted Translation 3:  él es un buen nadador 
Predicted Translation 4:  él es un buen escritor 
==============================

Input Sentence: what time do you get up on schooldays
Actual Translation:  ¿a qué hora te levantas en días de clase 
Closest Predicted Sentence (Spanish):  ¿a qué hora te levantas en las 
Closest Predicted Sentence (English): What time do you get up

Predicted Translation 0:  ¿a qué hora te levantas por el 
Predicted Translation 1:  ¿a qué hora te levantas por australia 
Predicted Translation 2:  ¿a qué hora te levantas en unos 
Predicted Translation 3:  ¿a qué hora te levantas en las 
Predicted Translation 4:  ¿a qué hora te levantas en el 
==============================

The combination of White Distance and beam search with width 10 definitely helps. For example, compare the result of greedy search (previous cell) vs the above combination (this cell) for some of the sentences. You will notice that we get closer and closer to the actual translation.

Example 1

# greedy 
Input Sentence: what time do you get up on schooldays
Actual Translation:  ¿a qué hora te levantas en días de clase 
Predicted Translation (Spanish):  ¿a qué hora te levantas en australia 
Predicted Translation (English): What time do you get up in australia
# beam search + white distance
Input Sentence: what time do you get up on schooldays
Actual Translation:  ¿a qué hora te levantas en días de clase 
Closest Predicted Sentence (Spanish):  ¿a qué hora te levantas en las 
Closest Predicted Sentence (English): What time do you get up

Example 2

# greedy
Input Sentence: he is a good violinist
Actual Translation:  él es un buen violinista 
Predicted Translation (Spanish):  él es un buen nadador 
Predicted Translation (English): he is a good swimmer
# beam search + white distance
Input Sentence: he is a good violinist
Actual Translation:  él es un buen violinista 
Closest Predicted Sentence (Spanish):  él es un buen violinista 
Closest Predicted Sentence (English): He is a good violinist.

Example 3

# greedy
Input Sentence: i havent slept in days
Actual Translation:  no he dormido en días 
Predicted Translation (Spanish):  no dormí en dos años 
Predicted Translation (English): I did not sleep in two years
# beam search + white distance
Input Sentence: i havent slept in days
Actual Translation:  no he dormido en días 
Closest Predicted Sentence (Spanish):  no dormí en dos días 
Closest Predicted Sentence (English): I did not sleep in two days

Here, days is correct, instead of the years produced by greedy decoding

Example 4

# greedy
Input Sentence: could you send me a brochure
Actual Translation:  ¿podrías enviarme un folleto 
Predicted Translation (Spanish):  ¿podrías enviarme un catálogo 
Predicted Translation (English): Could you send me a catalog
# beam search + white distance
Input Sentence: could you send me a brochure
Actual Translation:  ¿podrías enviarme un folleto 
Closest Predicted Sentence (Spanish):  ¿podrías enviarme un folleto 
Closest Predicted Sentence (English): Could you send me a brochure?

Testing Data

val_gen = generate_batch(X_test, y_test, batch_size = 1)
k=-1
for _ in range(20):
    k+=1
    (input_seq, actual_output), _ = next(val_gen)
    decoded_sentence = decode_sequence(input_seq)
    print('Input Sentence:', X_test[k:k+1].values[0])
    print('Actual Translation:', y_test[k:k+1].values[0][6:-4])
    print('Predicted Translation (Spanish):', decoded_sentence[:-4])
    # predicted spanish sequence back to english
    print('Predicted Translation (English):', get_translation(decoded_sentence[:-4]))
    print("="*60, end="\n\n")
Input Sentence: we work every day but sunday
Actual Translation:  trabajamos todos los días excepto el domingo 
Predicted Translation (Spanish):  trabajamos todos los días excepto los domingos 
Predicted Translation (English): We work every day but Sunday.
============================================================

Input Sentence: does tom want a car
Actual Translation:  ¿tomás quiere un auto 
Predicted Translation (Spanish):  ¿tom quiere un coche 
Predicted Translation (English): Does tom want a car
============================================================

Input Sentence: i was surprised that tom spoke french so well
Actual Translation:  me sorprendió que tomás hablase francés tan bien 
Predicted Translation (Spanish):  me sorprendió que tom estaba y yo nunca más 
Predicted Translation (English): I was surprised that Tom was and I was never again
============================================================

Input Sentence: tom asked mary for help
Actual Translation:  tom le pidió ayuda a mary 
Predicted Translation (Spanish):  tom le pidió ayuda a mary 
Predicted Translation (English): Tom asked Mary for help.
============================================================

Input Sentence: say cheese
Actual Translation:  di patata 
Predicted Translation (Spanish):  decid patata 
Predicted Translation (English): Say cheese.
============================================================

Input Sentence: some of these young people have legs twice as long as mine
Actual Translation:  algunos de estos jóvenes tienen las piernas el doble de largas que las mías 
Predicted Translation (Spanish):  algunas personas murieron mucho de veces más tarde
Predicted Translation (English): some people died a lot of times later
============================================================

Input Sentence: i dont care what she eats
Actual Translation:  no me interesa lo que ella coma 
Predicted Translation (Spanish):  no me importa lo que tú 
Predicted Translation (English): Would it move you enough
============================================================

Input Sentence: i will have to tell him the truth tomorrow
Actual Translation:  tendré que decirle la verdad mañana 
Predicted Translation (Spanish):  me encargaré de que quieras 
Predicted Translation (English): I&#39;ll take care of what you want
============================================================

Input Sentence: weve never lived here
Actual Translation:  nunca hemos vivido aquí 
Predicted Translation (Spanish):  ya no hemos vivido aquí 
Predicted Translation (English): we have not lived here anymore
============================================================

Input Sentence: tom and mary couldnt move the heavy trunk
Actual Translation:  tom y mary no podían mover el pesado tronco 
Predicted Translation (Spanish):  tom y mary no pudieron defenderse por la salud 
Predicted Translation (English): tom and mary couldn&#39;t defend themselves for health
============================================================

Input Sentence: what subject do you like best
Actual Translation:  ¿cuál es la asignatura que más te gusta 
Predicted Translation (Spanish):  ¿qué es tu cosa que te gusta 
Predicted Translation (English): What is your thing that you like
============================================================

Input Sentence: he repeated his question
Actual Translation:  él repitió su pregunta 
Predicted Translation (Spanish):  él pagó su pregunta 
Predicted Translation (English): he paid his question
============================================================

Input Sentence: my brother bought a used car so it wasnt very expensive
Actual Translation:  mi hermano compró un auto usado así que no era muy caro 
Predicted Translation (Spanish):  mi hermano menor compró un vestido muy inteligente
Predicted Translation (English): my younger brother bought a very smart dress
============================================================

Input Sentence: croatia is a country located in the southeastern part of europe
Actual Translation:  croacia es un país situado en el sudeste de europa 
Predicted Translation (Spanish):  croacia es un país en el país por favor 
Predicted Translation (English): croatia is a country in the country please
============================================================

Input Sentence: he will come if you call him
Actual Translation:  él vendrá si tú le llamas 
Predicted Translation (Spanish):  él vendrá si te va a la mano 
Predicted Translation (English): he will come if it goes to your hand
============================================================

Input Sentence: of course you can take it if you want
Actual Translation:  por supuesto que puedes tomarlo sí quieres 
Predicted Translation (Spanish):  por lo puedes hacer por favor llámeme 
Predicted Translation (English): so can you please call me
============================================================

Input Sentence: tom shouldnt have made mary angry
Actual Translation:  tom no debió haber hecho enfadar a mary 
Predicted Translation (Spanish):  tom no debería haber estado haciendo hasta mary 
Predicted Translation (English): tom shouldn&#39;t have been doing until mary
============================================================

Input Sentence: you look stupid
Actual Translation:  pareces estúpido 
Predicted Translation (Spanish):  pareces un tonto 
Predicted Translation (English): you look like a fool
============================================================

Input Sentence: mary was a tomboy
Actual Translation:  mary era un chicazo 
Predicted Translation (Spanish):  mary es un chicazo 
Predicted Translation (English): Mary is a guy
============================================================

Input Sentence: i think youre the only one who cares
Actual Translation:  yo creo que usted es el único al que le importa 
Predicted Translation (Spanish):  creo que eres la única a la decisión 
Predicted Translation (English): I think you are the only one to decide
============================================================