GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================

Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again.  Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)


1.2
===

Upgrading
---------
    - IAuthority interface has been deprecated in favor of IAuthorizer.
      AllowAllAuthority and SimpleAuthority have been renamed to
      AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to
      simplify the upgrade to the new interface, a new abstract
      LegacyAuthorizer has been added - you should subclass it in your
      old IAuthority implementation and everything should just work
      (this only affects users who implemented custom authorities).
      The 'authority' setting in cassandra.yaml has been renamed to
      'authorizer'; 'authority' is no longer recognized. This affects all
      upgrading users.
    - 1.2 is NOT network-compatible with versions older than 1.0. That
      means if you want to do a rolling, zero-downtime upgrade, you'll need
      to upgrade first to 1.0.x or 1.1.x, and then to 1.2.  1.2 retains
      the ability to read data files from Cassandra versions at least
      back to 0.6, so a non-rolling upgrade remains possible with just
      one step.
    - The default partitioner for new clusters is Murmur3Partitioner,
      which is about 10% faster for index-intensive workloads.  Partitioners
      cannot be changed once data is in the cluster, however, so if you are
      switching to the 1.2 cassandra.yaml, you should change this back to
      RandomPartitioner or whatever your old partitioner was (see the
      sketch after this list).
    - If you are using counters and upgrading from a version prior to
      1.1.6, you should drain existing Cassandra nodes prior to the
      upgrade to prevent overcount during commitlog replay (see
      CASSANDRA-4782).  For non-counter uses, drain is not required
      but is a good practice to minimize restart time.
    - Tables using LeveledCompactionStrategy will default to not
      creating a row-level bloom filter.  The default in older versions
      of Cassandra differs; you should manually set the false positive
      rate to 1.0 (to disable) or 0.01 (to enable, if you make many
      requests for rows that do not exist), as in the sketch after this list.
    - The hints schema was changed from 1.1 to 1.2. Cassandra automatically
      snapshots and then truncates the hints column family as part of
      starting up 1.2 for the first time.  Additionally, upgraded nodes
      will not store new hints destined for older (pre-1.2) nodes. It is
      therefore recommended that you perform a cluster upgrade when all
      nodes are up.
    - The `nodetool removetoken` command (and corresponding JMX operation)
      has been renamed to `nodetool removenode`.  The new command is
      incompatible with the earlier `nodetool removetoken`; removing nodes
      this way in a mixed 1.1 (or lower) / 1.2 cluster is not supported.
    - The somewhat ill-conceived CollatingOrderPreservingPartitioner
      has been removed. Use Murmur3Partitioner (recommended) or
      ByteOrderedPartitioner instead.
    - Global option hinted_handoff_throttle_delay_in_ms has been removed.
      hinted_handoff_throttle_in_kb has been added instead.
    - The default bloom filter fp chance has been increased to 1%.
      This will save about 30% of the memory used by the old default.
      Existing columnfamilies will retain their old setting.
    - The default partitioner (for new clusters; the partitioner cannot be
      changed in existing clusters) was changed from RandomPartitioner to
      Murmur3Partitioner which provides faster hashing as well as improved
      performance with secondary indexes.
    - The default version of CQL (and cqlsh) is now CQL3. CQL2 is still
      available but you will have to use the thrift set_cql_version method
      (that is already supported in 1.1) to use CQL2. For cqlsh, you will need
      to use 'cqlsh -2'.
    - CQL3 is now considered final in this release. Compared to the beta
      version that is part of 1.1, this final version has a few additions
      (collections), but also some (incompatible) changes in the syntax for the
      options of the create/alter keyspace/table statements. Typically, the
      syntax to create a keyspace is now:
        CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy',
                                                'replication_factor' : 2 };
      Also, the consistency level can no longer be set in the language; it is
      now specified at the protocol level.
      Please refer to the CQL3 documentation for details.
    - In CQL3, the DROP behavior from ALTER TABLE has been removed for now
      (because it was not correctly implemented). We hope to add it back soon
      (Cassandra 1.2.1 or 1.2.2).
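
    As a concrete illustration of the partitioner and bloom filter notes
    above, the fragments below are sketches only; the keyspace, table and
    partitioner names are placeholders to adapt to your own cluster:

        # cassandra.yaml fragment: keep whatever partitioner the cluster
        # was originally created with when adopting the 1.2 config file
        partitioner: org.apache.cassandra.dht.RandomPartitioner

        -- CQL3: set an explicit bloom filter false positive rate on a
        -- LeveledCompactionStrategy table (1.0 disables it, 0.01 enables it)
        ALTER TABLE ks.cf WITH bloom_filter_fp_chance = 0.01;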

Features
--------
    - Cassandra can now handle concurrent CREATE TABLE schema changes
      as well as other updates
    - rpc_timeout has been split up to allow finer-grained control
      on timeouts for different operation types
    - num_tokens can now be specified in cassandra.yaml. This defines the
      number of tokens assigned to the host on the ring (default: 1).
      Also, specifying initial_token will override any num_tokens setting.
    - disk_failure_policy allows blacklisting failed disks in JBOD 
      configuration instead of erroring out indefinitely
    - event tracing can be configured per-connection ("trace_next_query")
      or globally/probabilistically ("nodetool settraceprobability")
    - Atomic batches are now supported server side: Cassandra guarantees that
      all mutations in a batch will be applied, even if the coordinator fails
      mid-batch, at the price of first writing the batch to another node
      (see the sketch after this list).
    - A new IAuthorizer interface has replaced the old IAuthority. IAuthorizer
      allows dynamic permission management via new CQL3 statements:
      GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing
      the permissions in Cassandra is being worked on and we expect to
      include it in 1.2.1 or 1.2.2.
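
    A minimal sketch of a CQL3 atomic (logged) batch; the table, columns and
    values are hypothetical:

        BEGIN BATCH
          INSERT INTO users (id, name) VALUES (1, 'alice');
          UPDATE users SET name = 'bob' WHERE id = 2;
        APPLY BATCH;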


1.1.5
=====

Upgrading
---------
    - Nothing specific to this release, but please see 1.1 if you are upgrading
      from a previous version.


1.1.4
=====

Upgrading
---------
    - Nothing specific to this release, but please see 1.1 if you are upgrading
      from a previous version.


1.1.3
=====

Upgrading
---------
    - Running "nodetool upgradesstables" after upgrading is recommended
      if you use Counter columnfamilies.

Features
--------
    - the cqlsh COPY command can now export to CSV flat files (see the
      sketch after this list)
    - added a new tools/bin/token-generator to facilitate generating evenly distributed tokens
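
    A minimal cqlsh sketch of the new CSV export; the keyspace, table and
    file names are placeholders:

        USE ks;
        COPY mytable TO 'mytable.csv';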


1.1.2
=====

Upgrading
---------
    - Nothing specific to this release, but please see 1.1 if you are upgrading
      from a previous version.

Features
--------
    - cqlsh has a new COPY command to load data from CSV flat files


1.1.1
=====

Upgrading
---------
    - Nothing specific to this release, but please see 1.1 if you are upgrading
      from a previous version.

Features
--------
    - Continuous commitlog archiving and point-in-time recovery.
      See conf/commitlog_archiving.properties
    - Incremental repair by token range, exposed over JMX


1.1
===

Upgrading
---------
    - Compression is enabled by default on newly created ColumnFamilies
      (and unchanged for ColumnFamilies created prior to upgrading).
    - If you are running a multi datacenter setup, you should upgrade to
      the latest 1.0.x (or 0.8.x) release before upgrading.  Versions
      0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible
      with 1.1.
    - EACH_QUORUM ConsistencyLevel is only supported for writes and will now
      throw an InvalidRequestException when used for reads.  (Previous
      versions would silently perform a LOCAL_QUORUM read instead.)
    - ANY ConsistencyLevel is only supported for writes and will now
      throw an InvalidRequestException when used for reads.  (Previous
      versions would silently perform a ONE read for range queries;
      single-row and multiget reads already rejected ANY.)
    - The largest mutation batch accepted by the commitlog is now 128MB.  
      (In practice, batches larger than ~10MB always caused poor
      performance due to load volatility and GC promotion failures.)
      Larger batches will continue to be accepted but will not be
      durable.  Consider setting durable_writes=false if you really
      want to use such large batches.
    - Make sure that global settings: key_cache_{size_in_mb, save_period}
      and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
      used instead of per-ColumnFamily options.
    - JMX methods no longer return custom Cassandra objects.  Any such methods
      will now return standard Maps, Lists, etc.
    - Hadoop input and output details are now separated.  If you were
      previously using methods such as getRpcPort you now need to use
      getInputRpcPort or getOutputRpcPort depending on the circumstance.
    - CQL changes:
      + Prior to 1.1, you could use KEY as the primary key name in some
        select statements, even if the PK was actually given a different
        name.  In 1.1+ you must use the defined PK name.
    - The sliced_buffer_size_in_kb option has been removed from the
      cassandra.yaml config file (this option was a no-op since 1.0).

Features
--------
    - Concurrent schema updates are now supported, with any conflicts
      automatically resolved. Please note that simultaneously running
      'CREATE COLUMN FAMILY' operations on different nodes will not be safe
      until version 1.2 due to the nature of ColumnFamily identifier
      generation; see CASSANDRA-3794 for more details.
    - The CQL language has undergone a major revision, CQL3, the
      highlights of which are covered at [1].  CQL3 is not
      backwards-compatible with CQL2, so we've introduced a
      set_cql_version Thrift method to specify which version you want.
      (The default remains CQL2 at least until Cassandra 1.2.)  cqlsh
      adds a --cql3 flag to enable this (see the example after this list).
      [1] http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
    - Row-level isolation: multi-column updates to a single row have
      always been *atomic* (either all will be applied, or none)
      thanks to the CommitLog, but until 1.1 they were not *isolated*
      -- a reader may see mixed old and new values while the update
      happens.
    - Finer-grained control over data directories, allowing a ColumnFamily to
      be pinned to a specific volume, e.g. one backed by SSD.
    - The bulk loader is no longer a fat client; it can be run from an
      existing machine in a cluster.
    - A new write survey mode has been added, similar to bootstrap (enabled via
      -Dcassandra.write_survey=true), but the node will not automatically join
      the cluster.  This is useful for cases such as testing different
      compaction strategies with live traffic without affecting the cluster.
    - Key and row caches are now global, similar to the global memtable
      threshold. Manual tuning of cache sizes per-columnfamily is no longer
      required.
    - Off-heap caches no longer require JNA, and will work out of the box
      on Windows as well as Unix platforms.
    - Streaming is now multithreaded.
    - Compactions may now be aborted via JMX or nodetool.
    - The stress tool is not new in 1.1, but it is newly included in
      binary builds as well as the source tree.
    - Hadoop: a new BulkOutputFormat is included which will directly write
      SSTables locally and then stream them into the cluster.
      YOU SHOULD USE BulkOutputFormat BY DEFAULT.  ColumnFamilyOutputFormat
      is still around in case for some strange reason you want results
      trickling out over Thrift, but BulkOutputFormat is significantly
      more efficient.
    - Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat,
      allowing index expressions to be evaluated server-side to reduce
      the amount of data sent to Hadoop.
    - Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via
      a boolean parameter to setInputColumnFamily, that pages through
      data column-at-a-time instead of row-at-a-time.
    - Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT
      to true.  This will produce each row's columns in a bag.
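
    For example, to opt into CQL3 from cqlsh (the host is a placeholder;
    without the flag, cqlsh keeps speaking CQL2):

        cqlsh --cql3 <host>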



1.0.8
=====

Upgrading
---------
    - Nothing specific to 1.0.8

Other
-----
    - Allow configuring socket timeout for streaming


1.0.7
=====

Upgrading
---------
    - Nothing specific to 1.0.7; please refer to the instructions for 1.0.6.

Other
-----
    - Adds a new setstreamthroughput command to nodetool to configure
      streaming throttling
    - Adds JMX property to get/set rpc_timeout_in_ms at runtime
    - Allow configuring (per-CF) bloom_filter_fp_chance


1.0.6
=====

Upgrading
---------
    - This release fixes an issue related to the chunk_length_kb option for
      compressed sstables. If you use compression on some column families, it
      is recommended after the upgrade to check the value of this option on
      those column families (the default value is 64). If the option is not
      set correctly, you should update the column family definition with the
      right value and then run scrub on the column family.
    - Please refer to the instructions for 1.0.5 if coming from an older version.


1.0.5
=====

Upgrading
---------
    - 1.0.5 fixes two important regressions in 1.0.4, so all information
      concerning 1.0.4 is valid for this release, but please avoid upgrading
      to 1.0.4.


1.0.4
=====

Upgrading
---------
    - Nothing specific to 1.0.4 but please see the 1.0 upgrading section if
      upgrading from a version prior to 1.0.0

Features
--------
    - A new upgradesstables command has been added to nodetool. It is very
      similar to scrub but without the ability to discard corrupted rows (and,
      as a consequence, it does not automatically snapshot first). This new
      command is to be preferred to scrub in all cases where sstables should
      be rewritten to the current format for upgrade purposes (see the
      sketch below).
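
    A typical invocation might look like the following; the host, keyspace
    and column family names are placeholders, and both trailing arguments
    are optional:

        nodetool -h <host> upgradesstables <keyspace> <column_family>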

JMX
---
    - The paths for the data, commit log and saved cache directories are now
      exposed through JMX
    - The in-memory bloom filter sizes are now exposed through JMX


1.0.3
=====

Upgrading
---------
    - Nothing specific to 1.0.3 but please see the 1.0 upgrading section if
      upgrading from a version prior to 1.0.0

Features
--------
    - For non-compressed sstables (compressed sstables already include more
      fine-grained checksums), a sha1 of the full sstable is now automatically
      created (in a file with the suffix -Digest.sha1). It can be used to check
      sstable integrity with sha1sum.


1.0.2
=====

Upgrading
---------
    - Nothing specific to 1.0.2 but please see the 1.0 upgrading section if
      upgrading from a version prior to 1.0.0

Features
--------
    - Cassandra CLI queries now have timing information


1.0.1
=====

Upgrading
---------
    - If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading
      section
    - For running on Windows as a Service, procrun is no longer distributed
      with Cassandra; see README.txt for more information on how to download
      it if necessary.
    - The names given to snapshot directories have been improved for human
      readability. If you had scripts relying on them, you may need to update
      those scripts.


1.0
===

Upgrading
---------
    - Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
      restart, one node at a time.  (0.8.0 or 0.8.1 are NOT network-compatible
      with 1.0: upgrade to the most recent 0.8 release first.)
      You do not need to bring down the whole cluster at once. 
    - After upgrading, run nodetool scrub against each node before running
      repair, moving nodes, or adding new ones.
    - CQL inserts/updates now generate microsecond resolution timestamps
      by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD
      MIX milliseconds and microseconds, with clients talking to servers
      generating milliseconds unable to overwrite the larger microsecond
      timestamps. If you are using CQL and this is important for your
      application, you can either perform a non-rolling upgrade to 1.0, or
      update your application first to use explicit timestamps with the "USING
      timestamp=X" syntax.
    - The BinaryMemtable bulk-load interface has been removed (use the
      sstableloader tool instead).
    - The compaction_thread_priority setting has been removed from
      cassandra.yaml (use compaction_throughput_mb_per_sec to throttle
      compaction instead).
    - CQL types bytea and date were renamed to blob and timestamp, respectively,
      to conform with SQL norms.  CQL type int is now a 4-byte int, not 8
      (which is still available as bigint).
    - Cassandra 1.0 uses arena allocation to reduce old generation
      fragmentation.  This means there is a minimum overhead of 1MB
      per ColumnFamily plus 1MB per index.
    - The SimpleAuthenticator and SimpleAuthority classes have been moved to
      the example directory (and are thus not available from the binary
      distribution). They never provided actual security and in their current
      state are only meant as examples.

Features
--------
    - SSTable compression is supported through the 'compression_options'
      parameter when creating/updating a column family. For instance, you can
      create a column family Cf using compression (through the Snappy library)
      in the CLI with:
        create column family Cf with compression_options={sstable_compression: SnappyCompressor}
      SSTable compression is not activated by default but can be activated or
      deactivated at any time.
    - Compressed SSTable blocks are checksummed to protect against bitrot
    - A new LevelDB-inspired compaction algorithm can be enabled by setting the
      ColumnFamily compaction_strategy=LeveledCompactionStrategy option (see
      the example after this list).  Leveled compaction means you only need to
      keep a few MB of space free for compaction instead of (in the worst
      case) 50%.
    - Ability to use multiple threads during a single compaction. See
      multithreaded_compaction in cassandra.yaml for more details.
    - Windows Service ("cassandra.bat install" to enable)
    - A dead node may be replaced in a single step by starting a new node
      with -Dcassandra.replace_token=<token>. More details can be found at
      http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
    - It is now possible to repair only the first range returned by the
      partitioner for a node with `nodetool repair -pr`. This makes it
      possible to repair a full cluster without any duplicated work by
      running this command on every node of the cluster.
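
    For example, leveled compaction could be enabled on an existing column
    family from the CLI with something like the following (the column family
    name is a placeholder):

        update column family Cf with compaction_strategy='LeveledCompactionStrategy';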

New data types
--------------
    - decimal

Other
-----
    - Hinted Handoff has two major improvements:
        - Hint replay is much more efficient thanks to a change in the data model
        - Hints are created for all replicas that do not ack a write.  (Formerly,
          only replicas known to be down when the write started were hinted.)
      This means that running with read repair completely off is much more
      viable than before, and the default read_repair_chance is reduced from 1.0
      ("always repair") to 0.1 ("repair 10% of the time").
    - The old per-ColumnFamily memtable thresholds
      (memtable_throughput_in_mb, memtable_operations_in_millions,
      memtable_flush_after_mins) are ignored, in favor of the global
      memtable_total_space_in_mb and commitlog_total_space_in_mb settings.
      This does not affect client compatibility -- the old options are
      still allowed, but have no effect. These options may be removed
      entirely in a future release.
    - Backlogged compactions will begin five minutes after startup.  The 0.8
      behavior of never starting compaction until a flush happens is usually
      not what is desired, but a short grace period is useful to allow caches
      to warm up first.
    - The deletion of compacted data files is no longer performed during
      garbage collection. This means compacted files will now be deleted
      without delay.


0.8.5
=====

Features
--------
    - SSTables copied to a data directory can be loaded by a live node through
      nodetool refresh (may be handy to load snapshots).
    - The configured compaction throughput is exposed through JMX.

Other
-----
    - The sstableloader is now bundled with the debian package.
    - Repair detects when a participating node is dead and fails instead of
      hanging forever.


0.8.4
=====

Upgrading
---------
    - Nothing specific to 0.8.4

Other
-----
    - This release fixes a bug in counters that could lead to significant
      overcounts.
    - It also fixes a slight upgrade regression from 0.8.3. It is thus advised
      to jump directly to 0.8.4 if upgrading from before 0.8.3.


0.8.3
=====

Upgrading
---------
    - Token removal has been revamped.  Removing tokens in a mixed cluster with
      0.8.3 will not work, so the entire cluster will need to be running 0.8.3
      first, except for the dead node.

Features
--------
    - It is now possible to use Thrift asynchronous and
      half-synchronous/half-asynchronous servers (see cassandra.yaml for more
      details).
    - It is now possible to access counter columns through Hadoop.

Other
-----
    - This release fixes a regression in 0.8 that could cause commit log
      segments to be deleted even though not all the data they contained had
      been flushed.  Upgrading from 0.8.* is very much encouraged.


0.8.2
=====

Upgrading
---------
    - 0.8.0 and 0.8.1 shipped with a bug that set the replicate_on_write
      option for counter column families to false (this option has no effect
      on non-counter column families). This is an unsafe default, and 0.8.2
      corrects it: the default for replicate_on_write is now true. It is
      advised to update your counter column family definitions if
      replicate_on_write was incorrectly set to false (before or after the
      upgrade).


0.8.1
=====

Upgrading
---------
    - 0.8.1 is backwards compatible with 0.8; the upgrade can be achieved by a
      simple rolling restart.
    - If upgrading from an earlier version (0.7), please refer to the 0.8
      section for instructions.

Features
--------
    - Numerous additions/improvements to CQL (support for counters, TTL, batch
      inserts/deletes, index dropping, ...).
    - Add two new AbstractTypes (comparator) to support compound keys
      (CompositeType and DynamicCompositeType), as well as a ReverseType to
      reverse the order of any existing comparator.
    - New option to bypass the commit log on some keyspaces (for advanced
      users).

Tools
-----
    - Add new data bulk loading utility (sstableloader).


0.8
===

Upgrading
---------
    - Upgrading from version 0.7.1 or later can be done with a rolling
      restart, one node at a time.  You do not need to bring down the
      whole cluster at once. 
    - After upgrading, run nodetool scrub against each node before running
      repair, moving nodes, or adding new ones.
    - Running nodetool drain before shutting down the 0.7 node is
      recommended but not required. (Skipping this will result in
      replay of entire commitlog, so it will take longer to restart but
      is otherwise harmless.)
    - 0.8 is fully API-compatible with 0.7.  You can continue
      to use your 0.7 clients.
    - Avro record classes used in map/reduce and Hadoop streaming code have
      been removed. Map/reduce can be switched to Thrift by changing
      org.apache.cassandra.avro in import statements to 
      org.apache.cassandra.thrift (no class names change). Streaming support 
      has been removed for the time being.
    - The loadbalance command has been removed from nodetool.  For similar
      behavior, decommission then rebootstrap with empty initial_token.
    - Thrift unframed mode has been removed.
    - The addition of key_validation_class means the cli will assume keys
      are bytes, instead of strings, in the absence of other information.
      See http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.


Features
--------
    - added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
      Python, respectively (see: drivers/ subdirectory and doc/cql)
    - added distributed Counters feature; 
      see http://wiki.apache.org/cassandra/Counters
    - optional intranode encryption; see comments around 'encryption_options'
      in cassandra.yaml
    - compaction multithreading and rate-limiting; see 
      'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
      cassandra.yaml
    - cassandra will limit total memtable memory usage to 1/3 of the heap
      by default.  This can be adjusted or disabled with the
      memtable_total_space_in_mb option (see the sketch after this list).
      The old per-ColumnFamily throughput, operations, and age settings are
      still respected but will be removed in a future major release once we
      are satisfied that memtable_total_space_in_mb works adequately.
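
    For example, to cap total memtable memory explicitly rather than relying
    on the 1/3-of-heap default, a cassandra.yaml fragment might look like
    (the value is a placeholder):

        memtable_total_space_in_mb: 2048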

Tools
-----
    - stress and py_stress moved from contrib/ to tools/
    - clustertool was removed (see 
      https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
      of how to script nodetool across the cluster instead)

Other
-----
    - In the past, sstable2json would write column names and values as
      hex strings; it now creates human-readable values based on the
      comparator/validator.  As a result, JSON dumps created with
      older versions of sstable2json are no longer compatible with
      json2sstable, and imports must be made with a configuration that
      is identical to the export.
    - manually-forced compactions ("nodetool compact") will do nothing
      if only a single SSTable remains for a ColumnFamily. To force it
      to compact that anyway (which will free up space if there are
      a lot of expired tombstones), use the new forceUserDefinedCompaction
      JMX method on CompactionManager.
    - most of contrib/ (which was not part of the binary releases)
      has been moved either to examples/ or tools/. We plan to move the
      rest for 0.8.1.

JMX
---
    - By default, JMX now listens on port 7199.


0.7.6
=====

Upgrading
---------
    - Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading
      from earlier than 0.7.1.


0.7.5
=====

Upgrading
---------
    - Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading
      from earlier than 0.7.1.

Changes
-------
    - system_update_column_family no longer snapshots before applying
      the schema change. (_update_keyspace never did.  _drop_keyspace
      and _drop_column_family continue to snapshot.)
    - added memtable_flush_queue_size option to cassandra.yaml to
      avoid blocking writes when multiple column families (or a column
      family with indexes) are flushed at the same time.
    - allow overriding initial_token, storage_port and rpc_port using
      system properties (see the sketch after this list)
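
    A sketch of such an override, assuming the cassandra.* system property
    names and that the flags are appended to the JVM options (e.g. via
    conf/cassandra-env.sh); the values are placeholders:

        JVM_OPTS="$JVM_OPTS -Dcassandra.initial_token=<token> -Dcassandra.rpc_port=9160"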


0.7.4
=====

Upgrading
---------
    - Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading
      from earlier than 0.7.1.

Features
--------
    - Output to Pig is now supported as well as input


0.7.3
=====

Upgrading
---------
    - 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
      bloom filters to be generated when compacting sstables generated
      with earlier versions.  This would manifest in IOExceptions during
      column name-based queries.  0.7.3 provides "nodetool scrub" to 
      rebuild sstables with correct bloom filters, with no data lost.
      (If your cluster was never on 0.7.0 or earlier, you don't have to
      worry about this.)  Note that nodetool scrub will snapshot your
      data files before rebuilding, just in case.


0.7.1
=====

Upgrading
---------
    - 0.7.1 is completely backwards compatible with 0.7.0.  Just restart
      each node with the new version, one at a time.  (The cluster does
      not all need to be upgraded simultaneously.)

Features
--------
    - added flush_largest_memtables_at and reduce_cache_sizes_at options
      to cassandra.yaml as an escape valve for memory pressure
    - added option to specify -Dcassandra.join_ring=false on startup
      to allow "warm spare" nodes or performing JMX maintenance before
      joining the ring

Performance
-----------
    - Disk writes and sequential scans avoid polluting page cache
      (requires JNA to be enabled)
    - Cassandra performs writes efficiently across datacenters by
      sending a single copy of the mutation and having the recipient
      forward that to other replicas in its datacenter.
    - Improved network buffering
    - Reduced lock contention on memtable flush
    - Optimized supercolumn deserialization
    - Zero-copy reads from mmapped sstable files
    - Explicitly set higher JVM new generation size
    - Reduced i/o contention during saving of caches


0.7.0
=====

Features
--------
    - Secondary indexes (indexes on column values) are now supported
    - Row size limit increased from 2GB to 2 billion columns.  Rows
      are no longer read into memory during compaction.
    - Keyspace and ColumnFamily definitions may be added and modified live
    - Streaming data for repair or node movement no longer requires an
      anticompaction step first
    - NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for 
      use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC.  See comments
      in `cassandra.yaml`.
    - Optional per-Column time-to-live field allows expiring data without
      having to issue explicit remove commands
    - `truncate` thrift method allows clearing an entire ColumnFamily at once
    - Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
      support
    - Up to 8x faster reads from row cache
    - A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
      and orders keys by their byte value.  This should be used in new
      deployments instead of OrderPreservingPartitioner.
    - Optional round-robin scheduling between keyspaces for multitenant
      clusters
    - Dynamic endpoint snitch mitigates the impact of impaired nodes
    - New `IntegerType`, faster than LongType, allowing integers of
      both fewer and more bits than Long's 64
    - A revamped authentication system that decouples authorization and 
      allows finer-grained control of resources.

Upgrading
---------
    The Thrift API has changed in incompatible ways; see below, and refer
    to http://wiki.apache.org/cassandra/ClientOptions for a list of
    higher-level clients that have been updated to support the 0.7 API.

    The Cassandra inter-node protocol is incompatible with 0.6.x
    releases (and with 0.7 beta1), meaning you will have to bring your
    cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
    
    The hints schema was changed from 0.6 to 0.7. Cassandra automatically
    snapshots and then truncates the hints column family as part of 
    starting up 0.7 for the first time.

    Keyspace and ColumnFamily definitions are stored in the system
    keyspace, rather than the configuration file.

    The process to upgrade is:
    1) run "nodetool drain" on _each_ 0.6 node.  When drain finishes (log
       message "Node is drained" appears), stop the process.
    2) Convert your storage-conf.xml to the new cassandra.yaml using 
       "bin/config-converter".  
    3) Rename any of your keyspace or column family names that do not adhere
       to the '^\w+' regex convention.
    4) Start up your cluster with the 0.7 version.
    5) Initialize your Keyspace and ColumnFamily definitions using 
       "bin/schematool <host> <jmxport> import".  _You only need to do 
       this to one node_.

Thrift API
----------
    - The Cassandra server now defaults to framed mode, rather than
      unframed.  Unframed is obsolete and will be removed in the next
      major release.
    - The Cassandra Thrift interface file has been updated for Thrift 0.5.
      If you are compiling your own client code from the interface, you
      will need to upgrade the Thrift compiler to match.
    - Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
      returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
      CollatingOrderPreservingPartitioner continue to expect that keys contain
      UTF-8 encoded strings, but RandomPartitioner now works on any key data.
    - keyspace parameters have been replaced with the per-connection
      set_keyspace method.
    - The return type for login() is now AccessLevel.
    - The get_string_property() method has been removed.
    - The get_string_list_property() method has been removed.

Configuration
-------------
    - Configuration file renamed to cassandra.yaml and log4j.properties to
      log4j-server.properties
    - PropertyFileSnitch configuration file renamed to 
      cassandra-topology.properties
    - The ThriftAddress and ThriftPort directives have been renamed to
      RPCAddress and RPCPort respectively.
    - EndPointSnitch was renamed to RackInferringSnitch.  A new SimpleSnitch
      has been added.
    - RackUnawareStrategy and RackAwareStrategy have been renamed to
      SimpleStrategy and OldNetworkTopologyStrategy, respectively.
    - RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
    - GCGraceSeconds is now per-ColumnFamily instead of global
    - Keyspace and column family names that do not conform to the '^\w+' regex
      are considered illegal.
    - Keyspace and column family definitions will need to be loaded via
      "bin/schematool <host> <jmxport> import".  _You only need to do this to
      one node_.
    - In addition to an authenticator, an authority must be configured as
      well. Users of SimpleAuthenticator should use SimpleAuthority for this
      value (the default is AllowAllAuthority, which corresponds with 
      AllowAllAuthenticator).
    - The format of access.properties has changed, see the sample configuration
      conf/access.properties for documentation on the new format.


JMX
---
    - StreamingService moved from o.a.c.streaming to o.a.c.service
    - GMFD renamed to GOSSIP_STAGE
    - {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
      since it no longer has to wait until compaction to be computed

Other
-----
    - If extending AbstractType, make sure you follow the singleton pattern
      followed by Cassandra core AbstractType classes: provide a public
      static final variable called 'instance'.


0.6.6
=====

Upgrading
---------
    - As part of the cache-saving feature, a third directory
      (along with data and commitlog) has been added to the config
      file.  You will need to set and create this directory
      when restarting your node into 0.6.6.


0.6.1
=====

Upgrading
---------
    - We try to keep minor versions 100% compatible (data format,
      commitlog format, network format) within the major series, but
      we introduced a network-level incompatibility in 0.6.1.
      Thus, if you are upgrading from 0.6.0 to any higher version
      (0.6.1, 0.6.2, etc.) then you will need to restart your entire
      cluster with the new version, instead of being able to do a
      rolling restart.


0.6.0
=====

Features
--------
    - row caching: configure with the RowsCached attribute in
      ColumnFamily definition
    - Hadoop map/reduce support: see contrib/word_count for an example
    - experimental authentication support, described under
      Authenticator in storage.conf

Configuration
-------------
    - MemtableSizeInMB has been replaced by MemtableThroughputInMB which
      triggers a memtable flush when the specified amount of data has 
      been written, including overwrites.
    - MemtableObjectCountInMillions has been replaced by the
      MemtableOperationsInMillions directive which causes a memtable flush
      to occur after the specified number of operations.
    - Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
      BinaryMemtableThroughputInMB.
    - Replication factor is now per-keyspace, rather than global.
    - KeysCachedFraction is deprecated in favor of KeysCached
    - RowWarningThresholdInMB added, to warn before very large rows
      get big enough to threaten node stability

Thrift API
----------
    - removed deprecated get_key_range method
    - added batch_mutate method
    - deprecated multiget and batch_insert methods in favor of
      multiget_slice and batch_mutate, respectively
    - added ConsistencyLevel.ANY, for when you want write
      availability even when it may not be readable immediately.
      Unlike CL.ZERO, though, it will throw an exception if
      it cannot be written *somewhere*.

JMX metrics
-----------
    - read and write statistics are reported as lifetime totals,
      instead of averages over the last minute.  Averages since last
      requested are also available for convenience.
    - cache hit rate statistics are now available from JMX under
      org.apache.cassandra.db.Caches
    - compaction JMX metrics are moved to
      org.apache.cassandra.db.CompactionManager.  PendingTasks is now
      a much better estimate of compactions remaining, and the
      progress of the current compaction has been added.
    - commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
    - progress of data streaming during bootstrap, loadbalance, or other
      data migration, is available under 
      org.apache.cassandra.streaming.StreamingService.
      See http://wiki.apache.org/cassandra/Streaming for details.

Installation/Upgrade
--------------------
    - 0.6 network traffic is not compatible with earlier versions.  You
      will need to shut down all your nodes at once, upgrade, then restart.



0.5.0
=====

0. The commitlog format has changed (but sstable format has not). 
   When upgrading from 0.4, empty the commitlog either by running 
   bin/nodeprobe flush on each machine and waiting for the flush to finish,
   or simply remove the commitlog directory if you only have test data.
   (If more writes come in after the flush command, starting 0.5 will error
   out; if that happens, just go back to 0.4 and flush again.)
   The format changed twice: from 0.4 to beta1, and from beta2 to RC1.

.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
   in a cluster of 0.4 nodes or vice versa; you must upgrade your
   whole cluster at the same time.

1. Bootstrap, move, load balancing, and active repair have been added.
   See http://wiki.apache.org/cassandra/Operations.  When upgrading
   from 0.4, leave autobootstrap set to false for the first restart
   of your old nodes.

2. Performance improvements across the board, especially on the write
   path (over 100% improvement in stress.py throughput).

3. Configuration:
     - Added "comment" field to ColumnFamily definition.
     - Added MemtableFlushAfterMinutes, a global replacement for the 
       old per-CF FlushPeriodInMinutes setting
     - Key cache settings

4. Thrift:
     - Added get_range_slice, deprecating get_key_range



0.4.2
=====

1. Improve default garbage collector options significantly --
   throughput will be 30% higher or more.



0.4.1
=====

1. SnapshotBeforeCompaction configuration option allows snapshotting
   before each compaction, which allows rolling back to any version
   of the data.



0.4.0
=====

1. On-disk data format has changed to allow billions of keys/rows per
   node instead of only millions.  The new format is incompatible with 0.3;
   see 0.3 notes below for how to import data from a 0.3 install.

2. Cassandra now supports multiple keyspaces.  Typically you will have
   one keyspace per application, allowing applications to be able to
   create and modify ColumnFamilies at will without worrying about
   collisions with others in the same cluster.

3. Many Thrift API changes and documentation.  See 
   http://wiki.apache.org/cassandra/API

4. Removed the web interface in favor of JMX and bin/nodeprobe, which
   has significantly enhanced functionality.

5. Renamed configuration "<Table>" to "<Keyspace>".

6. Added commitlog fsync; see "<CommitLogSync>" in configuration.



0.3.0
=====

1. With enough and large enough keys in a ColumnFamily, Cassandra will
   run out of memory trying to perform compactions (data file merges).
   The size of what is stored in memory is (S + 16) * (N + M) where S
   is the size of the key (usually 2 bytes per character), N is the
   number of keys and M is the map overhead (which can be guesstimated
   at around 32 bytes per key).
   So, if you have 10-character keys and 1GB of headroom in your heap
   space for compaction, you can expect to store about 17M keys
   before running into problems.
   See https://issues.apache.org/jira/browse/CASSANDRA-208

2. Because fixing #1 requires a data file format change, 0.4 will not
   be binary-compatible with 0.3 data files.  A client-side upgrade
   can be done relatively easily with the following algorithm:
     for key in old_client.get_key_range(everything):
         columns = old_client.get_slice or get_slice_super(key, all columns)
         new_client.batch_insert or batch_insert_super(key, columns)
   The loop body can be trivially parallelized for speed.

3. Commitlog does not fsync before reporting a write successful.
   Using blocking writes mitigates this to some degree, since all
   nodes that were part of the write quorum would have to fail
   before sync for data to be lost.
   See https://issues.apache.org/jira/browse/CASSANDRA-182

Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.

See https://issues.apache.org/jira/browse/CASSANDRA-16