From d01c0856e15d5ff639a30f71c149addcf146d0af Mon Sep 17 00:00:00 2001
From: saroup
Date: Wed, 12 Feb 2020 13:59:21 -0500
Subject: [PATCH 1/4] Adding design doc that explains how we calculate confusion matrix numbers

---
 docs/ConfusionMatrixCalculations.md   | 108 ++++++++++++++++++++++++++
 docs/images/accuracy_table_output.PNG | Bin 0 -> 34710 bytes
 2 files changed, 108 insertions(+)
 create mode 100644 docs/ConfusionMatrixCalculations.md
 create mode 100644 docs/images/accuracy_table_output.PNG

diff --git a/docs/ConfusionMatrixCalculations.md b/docs/ConfusionMatrixCalculations.md
new file mode 100644
index 0000000..48396c7
--- /dev/null
+++ b/docs/ConfusionMatrixCalculations.md
@@ -0,0 +1,108 @@
+# NLU.DevOps confusion matrix calculations
+
+This document describes how NLU.DevOps processes the results returned by the NLU provider and how the confusion matrix metrics are computed and reported.
+
+The compare sub-command compares the results returned from the NLU provider with the labeled test set. It runs the comparison for 4 kinds of results:
+- Intents
+- Entities
+- Text
+- Entity Values
+
+For each test utterance one or more confusion matrix results can be computed:
+- True Positive (TP)
+- True Negative (TN)
+- False Positive (FP)
+- False Negative (FN)
+
+The cases where more than one confusion matrix result is returned are as follows:
+- When the actual and expected intents mismatch, an FP result is returned for the actual intent and an FN result is returned for the expected intent.
+- When the actual and expected entities mismatch, an FP result is returned for the actual entity and an FN result is returned for the expected entity.
+
+## Confusion matrix output for intent comparison
+
+|Utterance text | Actual Intent | Expected Intent |
+|:---------------------: | :------------: |:---------------:|
+|who is bob goodermuth? 
| ContactInfo | ContactInfo|
+**True Positive for ContactInfo:** The expected and actual intents are equal
+
+
+
+| Utterance text | Actual Intent | Expected Intent |
+|:---------------------: | :------------: |:---------------:|
+|who is bob goodermuth? | Greeting | Greeting|
+**True Negative for ContactInfo:** The actual intent is not ContactInfo, and the expected intent is not ContactInfo
+
+### Producing two results
+| Utterance text | Actual Intent | Expected Intent |
+|:---------------------: | :------------: |:---------------:|
+|who is bob goodermuth? | Greeting | ContactInfo|
+**False Positive** for Greeting and **False Negative** for ContactInfo since intents mismatch
+
+## Confusion matrix output for entity comparison
+When the expected list of entities is not empty, we check that each entity matches by type, value and by the occurrence index of the matching text in the utterance.
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Schedule meeting with bob tomorrow| 26, 33, datetime, tomorrow
23, 25, personName, bob| 23, 25, personName, bob
26, 33, datetime, tomorrow|
+**True Positive** for datetime and personName since they have a corresponding match in the actual entities based on entity type, text match and index match.
+
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Schedule meeting with bob tomorrow| 26, 33, datetime, tomorrow
23, 25, personName, bob| |
+
+**False Positive** for datetime and personName since they don’t have a matching entity in the expected entities list.
+
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Schedule meeting with bob | 17, 24, personName, bob| 22, 24, personName, bob|
+**False Positive** for personName since the start position doesn’t match the expected one.
+
+### Producing two results
+
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Schedule meeting with bob | 22, 24, userName, bob| 22, 24, personName, bob|
+
+**False Positive** for userName since it doesn’t have a matching entity in the expected entities list
+**False Negative** for personName since it doesn’t have a corresponding match in the actual entities
+
+
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Schedule meeting with bob tomorrow| | 23, 25, personName, bob
26, 33, datetime, tomorrow|
+
+**False Negative** for personName and datetime since they don’t have a corresponding match in the actual entities
+
+
+| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+|:---------------------: | :------------: |:---------------:|
+|Good Morning| | |
+
+**True Negative:** There are no expected entities and no entities identified
+
+
+## Confusion matrix output for entity value comparison
+
+Besides comparing entity values for the entities comparison we also compute True Positive and False Negative metrics by comparing entity values only.
+| Utterance text | Actual entity value | Expected entity value |
+|:---------------------: | :------------: |:---------------:|
+|Do I have unpaid bills?| "bills",
"invoice",
"invoices"| bills|
+
+
+**True Positive:** for bills since the entity matches by type, startPos and endPos and the expected value has a match in the actual entity value subtree.
+
+**True Negative:** Not calculated
+
+**False Positive:** Not calculated
+
+| Utterance text | Actual entity value | Expected entity value |
+|:---------------------: | :------------: |:---------------:|
+|Do I have unpaid bills?| "bills",
"invoice",
"invoices"| checks|
+**False Negative:** for checks since the entity matches by type, startPos and endPos, but the entity resolution value does not contain the expected subtree
+
+
+## Reporting F-measure
+After computing the confusion matrix numbers, we compute the precision, recall and F1 score for each intent and entity, and the micro-average F-measure for intents and entities
+
+We print the following tables, which inform the user about the performance of the model and help them decide which intents or entities to improve.
+
+![accuracy table](images/accuracy_table_output.PNG)
diff --git a/docs/images/accuracy_table_output.PNG b/docs/images/accuracy_table_output.PNG
new file mode 100644
index 0000000000000000000000000000000000000000..abd3604db96e83f0cb3517102b442792e923159a
GIT binary patch
literal 34710
[encoded PNG data omitted]

literal 0
HcmV?d00001

From 04f6b128900a7f522d2f5ca85b8bbc041686883a Mon Sep 17 00:00:00 2001
From: saroup
Date: Thu, 13 Feb 2020 08:39:22 -0500
Subject: [PATCH 2/4] adjusting spaces

---
 docs/ConfusionMatrixCalculations.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/ConfusionMatrixCalculations.md b/docs/ConfusionMatrixCalculations.md
index 48396c7..b76953e 100644
--- a/docs/ConfusionMatrixCalculations.md
+++ b/docs/ConfusionMatrixCalculations.md
@@ -23,19 +23,21 @@ For each test utterance one or more confusion matrix results can be computed:
 |Utterance text | Actual Intent | Expected Intent |
 |:---------------------: | :------------: |:---------------:|
 |who is bob goodermuth? 
| ContactInfo | ContactInfo|
-**True Positive for ContactInfo:** The expected and actual intents are equal
+**True Positive for ContactInfo:** The expected and actual intents are equal
 | Utterance text | Actual Intent | Expected Intent |
 |:---------------------: | :------------: |:---------------:|
 |who is bob goodermuth? | Greeting | Greeting|
+
 **True Negative for ContactInfo:** The actual intent is not ContactInfo, and the expected intent is not ContactInfo
 ### Producing two results
 | Utterance text | Actual Intent | Expected Intent |
 |:---------------------: | :------------: |:---------------:|
 |who is bob goodermuth? | Greeting | ContactInfo|
+
 **False Positive** for Greeting and **False Negative** for ContactInfo since intents mismatch
 ## Confusion matrix output for entity comparison
@@ -43,6 +45,7 @@ When the expected list of entities is not empty, we check that each entity match
 | Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
 |:---------------------: | :------------: |:---------------:|
 |Schedule meeting with bob tomorrow| 26, 33, datetime, tomorrow
23, 25, personName, bob| 23, 25, personName, bob
26, 33, datetime, tomorrow|
+
 **True Positive** for datetime and personName since they have a corresponding match in the actual entities based on entity type, text match and index match.
 | Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
@@ -97,6 +100,7 @@ Besides comparing entity values for the entities comparison we also compute True
 | Utterance text | Actual entity value | Expected entity value |
 |:---------------------: | :------------: |:---------------:|
 |Do I have unpaid bills?| "bills",
"invoice",
"invoices"| checks|
+
 **False Negative:** for checks since the entity matches by type, startPos and endPos, but the entity resolution value does not contain the expected subtree

From 10d3c1f468bf83c70ab427ec44fdee63c12a5f64 Mon Sep 17 00:00:00 2001
From: saroup
Date: Thu, 13 Feb 2020 08:51:24 -0500
Subject: [PATCH 3/4] adding text confusion matrix calculations

---
 docs/ConfusionMatrixCalculations.md | 40 ++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/docs/ConfusionMatrixCalculations.md b/docs/ConfusionMatrixCalculations.md
index b76953e..92389c9 100644
--- a/docs/ConfusionMatrixCalculations.md
+++ b/docs/ConfusionMatrixCalculations.md
@@ -42,26 +42,26 @@ For each test utterance one or more confusion matrix results can be computed:
 ## Confusion matrix output for entity comparison
 When the expected list of entities is not empty, we check that each entity matches by type, value and by the occurrence index of the matching text in the utterance.
-| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) |
+| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob tomorrow| 26, 33, datetime, tomorrow
23, 25, personName, bob| 23, 25, personName, bob
26, 33, datetime, tomorrow| **True Positive** for datetime and personName since they have a corresponding match in the actual entities based on entity type, text match and index match. -| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) | +| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob tomorrow| 26, 33, datetime, tomorrow
23, 25, personName, bob| | **False Positive** for datetime and personName since they don’t have a matching entity in the expected entities list. -| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) | +| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob | 17, 24, personName, bob| 22, 24, personName, bob| **False Positive** for personName since the start position doesn’t match with the expected one. ### Producing two results -| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) | +| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob | 22, 24, userName, bob| 22, 24, personName, bob| @@ -69,19 +69,47 @@ When the expected list of entities is not empty, we check that each entity match **FalseNegative** for personName since it doesn’t have a corresponding match in the actual entities -| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) | +| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob tomorrow| | 23, 25, personName, bob
26, 33, datetime, tomorrow **False Negative** for personName and datetime since they don’t have a corresponding match in the actual entities26, 33, datetime, tomorrow -| Utterance text | Actual entities (startPos, endPos, type, value) | Expected entities (startPos, endPos, type, value) | +| Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Good Morning| | **True Negative:** There are no expected entities and no entities identified +## Confusion matrix output for text comparison +We compute the text results for speech tests only. + +|Actual utterance text | Expected utterance text | +|:---------------------:| :-----------------------:| +|Good morning, Cortana | Good morning, Cortana | + +**True Positive:** The normalized actual text equals the normalized expected text. Normalization collapses consecutive spaces into one space, removes punctuation, and ignores case + +|Actual utterance text | Expected utterance text | +|:---------------------:| :-----------------------:| +|Good morning, Courtney | Good morning cortana | + +**False Positive:** The actual and expected texts are not equal + +|Actual utterance text | Expected utterance text | +|:---------------------:| :-----------------------:| +|empty | empty | + + +**True Negative:** Both utterances are empty + +|Actual utterance text | Expected utterance text | +|:---------------------:| :-----------------------:| +|empty | Good morning, Cortana | + +**False Negative:** Only the actual utterance text is empty + ## Confusion matrix output for entity value comparison From f9f99c4c2dde8d52c5c3c2935e5ccad5fee9c179 Mon Sep 17 00:00:00 2001 From: saroup Date: Thu, 13 Feb 2020 20:10:34 -0500 Subject: [PATCH 4/4] minor modifications --- docs/ConfusionMatrixCalculations.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/ConfusionMatrixCalculations.md b/docs/ConfusionMatrixCalculations.md index 92389c9..56450ea 100644 --- a/docs/ConfusionMatrixCalculations.md +++ b/docs/ConfusionMatrixCalculations.md @@ -57,6 +57,7 @@ When the expected list of entities is not empty, we check that each entity match | Utterance text | Actual entities
(startPos, endPos, type, value) | Expected entities
(startPos, endPos, type, value) | |:---------------------: | :------------: |:---------------:| |Schedule meeting with bob | 17, 24, personName, bob| 22, 24, personName, bob| + **False Positive** for personName since the start position doesn’t match with the expected one. ### Producing two results