
Integration tests on 3 top-model CAS systems

Post by quinyu » 07 Jun 2015, 22:43

311 indefinite integration problems were tested on TI-Nspire CX CAS, HP Prime and Casio ClassPad II emulators, with the latest OS versions available to us. The quick summary:

  • The Casio ClassPad II solved 69% of the integrals correctly, the TI-Nspire CX CAS 71% and the HP Prime 81%.
  • At the 95% confidence level, it is safe to state that the HP Prime performed significantly better than the other two calculators tested.
  • We would love it if the respective manufacturers fixed these issues.

The detailed report can be found here: http://tiplanet.org/modules/archives/download.php?id=251888

No funding of any sort was received for this ongoing test. We claim no conflict of interest. No rabbits were harmed in the procedure. Yet. ;~)
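
For readers who want to sanity-check the significance claim without opening the report, here is a minimal sketch of a pooled two-proportion z-test. It is not necessarily the test used in the report. The success counts are an assumption: they are reconstructed from the rounded percentages above and may be off by a problem or two. The function name `two_proportion_z` is made up for this example.

```python
import math

N = 311  # number of integrals tested, from the post

# Assumption: counts rebuilt from the rounded percentages in the post.
successes = {
    "Casio ClassPad II": round(0.69 * N),
    "TI-Nspire CX CAS": round(0.71 * N),
    "HP Prime": round(0.81 * N),
}

def two_proportion_z(k1, k2, n):
    """Pooled z statistic and two-sided p-value for two samples of equal size n."""
    p1, p2 = k1 / n, k2 / n
    pooled = (k1 + k2) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

hp = successes["HP Prime"]
for other in ("TI-Nspire CX CAS", "Casio ClassPad II"):
    z, p = two_proportion_z(hp, successes[other], N)
    print(f"HP Prime vs {other}: z={z:.2f}, p={p:.4f}")

z, p = two_proportion_z(successes["TI-Nspire CX CAS"],
                        successes["Casio ClassPad II"], N)
print(f"TI-Nspire CX CAS vs Casio ClassPad II: z={z:.2f}, p={p:.4f}")
```

With these reconstructed counts, both HP Prime comparisons come out below p=0.05, while TI vs Casio does not. Note that this test treats the three samples as independent, although all calculators were given the same 311 problems; with the per-problem results from the report, a paired test such as McNemar's would be the better fit.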

Re: Integration tests on 3 top-model CAS systems

Post by Excale » 07 Jun 2015, 22:50

Nice :).

You put the result in red when it was wrong or unsuccessful. Did you also count the number of wrong answers for each calculator? (1+1 -> 3 is wrong; 1+2 -> 4-1 is correct, although it's not what you want.)

Re: Integration tests on 3 top-model CAS systems

Post by quinyu » 07 Jun 2015, 23:03

As long as the derivative of the answer given by the calculator is identical to the expression that was integrated, it was counted as correct; otherwise (or if the calculator froze/rebooted/started doing weird things) as a fail.

There is no single integration result (for example, you can rewrite the hyperbolic functions in terms of logarithms, and that's just one example out of hundreds), but every valid result should come to the same derivative all the same. That is: given Int(f(x),x)=g(x), if f(x)-deriv(g(x),x)=0, then it's good. Simplifications and rewrites were taken into account. If not, then the integration is wrong. Luckily, finding a derivative (as the check requires) is much simpler and quicker than integrating (this can be proven; less simple on complex numbers, but still).

I have in some places used blue as well (spot them all and figure out what was meant :P)

So, to answer your question: 1+1 -> 2 was considered correct, just like 1+2 -> 4-1. In places I complained about the bulkiness of the results (don't we all?), but as long as a result was in closed form and could be shown to give the same derivative, it was accepted.
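
The acceptance rule described above (differentiate the answer, compare with the integrand) can be illustrated with a small sketch. This is a numeric spot check using central differences, not the symbolic comparison done for the report, so it can only suggest that two expressions agree; it cannot prove it. The helper names, sample interval and tolerance are assumptions chosen for the example. The two candidate answers are the hyperbolic and logarithmic forms of the same antiderivative.

```python
import math
import random

def derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def is_antiderivative(f, g, lo=0.5, hi=3.0, samples=50, tol=1e-6, seed=0):
    """Return True if g'(x) matches f(x) at random sample points in [lo, hi]."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(lo, hi)
        if abs(derivative(g, x) - f(x)) > tol * max(1.0, abs(f(x))):
            return False
    return True

# Integrand: 1/sqrt(x^2 + 1)
f = lambda x: 1 / math.sqrt(x * x + 1)

# Two equivalent answers a CAS might return, plus one deliberately wrong answer.
g_hyperbolic = math.asinh
g_logarithmic = lambda x: math.log(x + math.sqrt(x * x + 1))
g_wrong = lambda x: math.log(x * x + 1)

print(is_antiderivative(f, g_hyperbolic))   # both valid forms pass
print(is_antiderivative(f, g_logarithmic))
print(is_antiderivative(f, g_wrong))        # the wrong answer is rejected
```

Both valid forms are accepted even though they look nothing alike as text, which is the reason for comparing derivatives rather than the printed results.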

Re: Integration tests on 3 top-model CAS systems

Post by Adriweb » 07 Jun 2015, 23:04

Nice document indeed!

Bernard Parisse reads TI-Planet, so I'm sure he'll stumble upon this topic sooner or later. In the meantime I'm going to share this with TI; maybe it can help improve the CAS engine :)

(One more thing: it would have been fun to add Wolfram Mathematica as another CAS engine; it probably would have obliterated all 3 calcs :P)

Re: Integration tests on 3 top-model CAS systems

Post by Excale » 07 Jun 2015, 23:06

I was thinking more about the cases put in red because the calculator answered with an integral.

It's a fail, but not a wrong result.

And... I really prefer to get a fail over a wrong result.

Edit:
The other fun fact is that if you combine all 3 calcs, you get a very good result. So: buy all of them :P.

Re: Integration tests on 3 top-model CAS systems

Post by quinyu » 07 Jun 2015, 23:22

As for TI, I don't know; I keep submitting this stuff (about once every 100 integrals covered) to TI as well as Casio (I couldn't find an email address for HP yet). As for Wolfram Mathematica, you would be surprised: it can sometimes go very wrong. RUBI is my integrator of choice there.

As for a partial result, I still count it as a bad thing, since the calculator throws the part it ultimately couldn't solve back at us. It will stay that way.

I did the statistics check: 33 of the 311 problems could not be solved by any of the three calculators. That's about a tenth of the problems. And currently I'm pretty damn fine with my FX-991DE Plus, not to mention that programmable and graphing calcs are not permitted in my school, at least not in tests. No reason to invest in any of the three. Maybe later.
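
The "combine all three" figure follows directly from the two numbers quoted in this thread. Below is a short sketch of that arithmetic, plus a helper showing how "solved by at least one calculator" could be computed from per-problem results. The helper name `solved_by_any` and the toy data are hypothetical; the real per-problem results are in the report.

```python
TOTAL = 311           # problems tested, from the thread
UNSOLVED_BY_ALL = 33  # problems no calculator solved, from the thread

def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

unsolved_pct = share(UNSOLVED_BY_ALL, TOTAL)
combined_pct = share(TOTAL - UNSOLVED_BY_ALL, TOTAL)
print(f"unsolved by all three:  {unsolved_pct:.1f}%")
print(f"solved by at least one: {combined_pct:.1f}%")

def solved_by_any(solved_sets):
    """Union of the problem ids solved by at least one calculator."""
    return set().union(*solved_sets.values())

# Hypothetical toy data: calculator name -> ids of problems it solved.
toy = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3}}
print(sorted(solved_by_any(toy)))
```

So the three calculators together cover roughly 89% of the problems, against 81% for the best single one.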

Re: Integration tests on 3 top-model CAS systems

Post by Adriweb » 07 Jun 2015, 23:28

quinyu wrote: As for TI, I don't know; I keep submitting this stuff (about once every 100 integrals covered) to TI as well as Casio (I couldn't find an email address for HP yet).

For TI, I (and some more people here) happen to know some TIers directly, so we can report bugs etc. directly to them (instead of going through TI-Cares etc.).
For HP, the CAS engine of the Prime is giac/xcas, which is developed by Bernard Parisse, who is a member of this forum :)
And I don't know about Casio.

By the way, maybe I skipped/didn't see it in the .pdf, but did you use the student software for the Nspire tests, or an actual device?
I have actually developed a REPL for the Nspire's CAS (it is not public yet) that can run several dozen (hundreds?) of calculations per second. It can work either as a REPL, or take input from a file and write the output to stdout. That would probably help for a test suite.
And I suppose that some kind of REPL/command-line interface for giac is trivial to get.

Re: Integration tests on 3 top-model CAS systems

Post by quinyu » 07 Jun 2015, 23:36

Only software for all three. But since I don't like time-limited options, that's kArmTI running there for the TI. Casio and HP run the manufacturer-released emus (with Casio running virtualised). As for hundreds of calculations per second: most of the time goes into the actual typing, so it's nice but wouldn't help me much. Thanks for mentioning it anyhow. Casio's emu is one release behind the real deal, but since I found no way to insert the new OS, I'm leaving it as it is for now.

Re: Integration tests on 3 top-model CAS systems

Post by Bisam » 07 Jun 2015, 23:39

I can see many issues in the paper:
  • Why is the TI-Nspire's answer for #18 not accepted? I suppose that it is because of automatic verification... but it is perfectly correct.
  • The same for #39...
  • #50 is counted as wrong for both the Nspire and the ClassPad... when it is correct for both. Why is that?
  • #52 and #53 are again correct for the Nspire but counted as wrong.
  • #68 is counted as a fail when it should be the best answer! The other two answers are wrong when n is -1. However, the Nspire doesn't give an answer even if n is specified to be positive, for example.

I didn't run through all answers but I'd like to know the reasons for excluding some good answers...

Re: Integration tests on 3 top-model CAS systems

Post by Adriweb » 07 Jun 2015, 23:45

quinyu wrote: Only software for all three. But since I don't like time limited options, that's kArmTI running there for the TI.

I can't help but mention Firebird Emu, now that it's out :)

Casio and HP run the manufacturer-released emus (with Casio running virtualized.)

Note: the official software packages are actually simulators, not emulators. They (try to) reproduce the software's behaviour, by compiling the source code for the desktop architecture instead of the calc's, rather than reproducing the hardware as an emulator would.

quinyu wrote: And as of hundreds of calculations per second - most of the time is the actual typing, so it's nice but wouldn't help me much. Thanks for mentioning anyhow.

Well, that is precisely the point: you could keep all the tests in a .txt file, run them all, and compare each output to the expected result, getting test results in a matter of seconds. That's infinitely faster than typing every single one by hand and comparing the output manually :P
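
A file-driven test run like the one suggested above could look roughly like this. It is only a sketch under stated assumptions: the `input => expected` line format, the function names and the `fake_cas` stand-in are all invented for illustration, since the Nspire REPL mentioned in this thread is not public. In practice `evaluate` would wrap whatever command-line interface the CAS offers.

```python
from typing import Callable, Iterable, List, Tuple

def parse_cases(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'input => expected' lines, skipping blanks and '#' comments."""
    cases = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        expr, _, expected = line.partition("=>")
        cases.append((expr.strip(), expected.strip()))
    return cases

def run_suite(cases, evaluate: Callable[[str], str]):
    """Run every case and return (input, expected, actual, passed) tuples."""
    results = []
    for expr, expected in cases:
        try:
            actual = evaluate(expr)
        except Exception as exc:  # a crash or freeze counts as a fail
            actual = f"ERROR: {exc}"
        results.append((expr, expected, actual, actual == expected))
    return results

# Hypothetical stand-in for a real CAS command-line interface.
def fake_cas(expr: str) -> str:
    table = {
        "integrate(x,x)": "x^2/2",
        "integrate(cos(x),x)": "sin(x)",
        "integrate(1/x,x)": "ln(abs(x))",
    }
    return table[expr]

SAMPLE = """\
# input => expected output
integrate(x,x) => x^2/2
integrate(cos(x),x) => sin(x)
integrate(1/x,x) => ln(x)
"""

results = run_suite(parse_cases(SAMPLE.splitlines()), fake_cas)
for expr, expected, actual, ok in results:
    print("PASS" if ok else "FAIL", expr, "->", actual)
```

The third case shows the limit of plain string comparison: an answer in a different but acceptable form is flagged as a mismatch. For integrals, the pass/fail decision would be better made by differentiating the output and comparing it with the integrand, as quinyu described earlier in the thread.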
